Abstract. Generalized network tomography (GNT) deals with estimation of link performance
parameters for networks with arbitrary topologies using only end-to-end path measurements of pure
unicast probe packets. In this paper, by taking advantage of the properties of generalized
hyperexponential distributions and polynomial systems, a novel algorithm to infer the complete link
metric distributions under the framework of GNT is developed. The significant advantages of this
algorithm are that it does not require: i) the path measurements to be synchronous and ii) any prior
knowledge of the link metric distributions. Moreover, if the path-link matrix of the network has the
property that every pair of its columns is linearly independent, then it is shown that the algorithm
can uniquely identify the link metric distributions up to any desired accuracy. Matlab based
simulations have been included to illustrate the potential of the proposed scheme.

Key words. generalized network tomography, generalized hyperexponential distributions,
unicast measurements, moment estimation, polynomial systems

AMS subject classifications. Primary, 47A50; Secondary, 47A52, 62J99, 65F30, 65C60, 
68M10, 68M20 

1. Introduction. The present-day Internet is a massive, heterogeneous network
of networks with decentralized control. Despite this, accurate, timely and localized
information about its connectivity, bandwidth and performance measures such as
average delay experienced by traffic, packet loss rates across links, etc. is
vital for its efficient management. Brute force techniques, such as gathering the
requisite information directly, impose an impractical overhead and hence are generally
avoided. This necessitated the advent of network tomography, the science of inferring
spatially localized network behaviour using only end-to-end aggregate metrics.

Recent advances in network tomography can be classified into two broad strands: 
i) traffic demand tomography — determination of source-destination traffic volumes 
via measurements of link volumes and ii) network delay tomography — link parameter 
estimation based on end-to-end path level measurements. For the first strand, see 
[27J S3 Under the second strand, the major problems studied include estimation 
of bottleneck link bandwidths, e.g. [HI [9], link loss rates, e.g. [3], link delays, e.g. 
[7J [22l H31 EH H], etc. Apart from these, there is also work on estimation of the 
topology of the network via path measurements. For excellent tutorials and surveys 
on the state of the art, see [TJ [7J El E]. For sake of definiteness, we consider here 
the problem of network delay tomography. The proposed solution is, however, also 
applicable to traffic demand tomography. 

Given a binary matrix A, usually called the path-link matrix, the central problem 
in network delay tomography, in abstract terms, is to accurately estimate the statistics 
of the vector X from the measurement model Y = AX. Based on this, existing 
work can be categorized into deterministic and stochastic approaches. Deterministic 
approaches, e.g. [HI [6], [10], treat X as a fixed but unknown vector and use linear
algebraic techniques to solve for X. Clearly, when no prior knowledge is available,
X can be uniquely recovered only when A is invertible, a condition often violated in
practice. Stochastic approaches, e.g. [5], [23], HS1 US], on the other hand, assume X to
be a non-negative random vector of mutually independent components and employ
parametric/non-parametric estimation techniques to infer the statistical properties of 
X using samples of Y. In this paper, we build a stochastic network tomography scheme 
and establish sufficient conditions on A for accurate identification of the distribution 
of X. 

Stochastic network tomography approaches, in general, model the distribution 
of each component of X using either a discrete distribution, e.g. [26l [29], or a fi- 
nite mixture model, e.g. [23] . They construct an optimization problem based 
on the characteristic function, e.g. [S], or a suitably chosen likelihood function, e.g. 
[231123112] j of Y. Algorithms such as expectation-maximization, e.g. [23J US I2S] , gen- 
eralized method of moments, e.g. [S], etc., which mainly exploit the correlations in 
the components of Y, are then employed to determine the optimal statistical estimates 
of X. In practice, however, these algorithms suffer from two main limitations. Firstly, note
that these algorithms directly utilize the samples of the vector Y. Thus, to implement
them, one would crucially require i) end-to-end data generated using multicast probe
packets, real or emulated, and ii) the network to be a tree rooted at a single sender
with destinations at leaves. Deviation from either of the above requirements, which
is often the case, thus results in performance degradation. Secondly, the optimization
problems considered tend to have multiple local optima. Thus, without prior
knowledge, the quality of the estimate is difficult to ascertain.

In this paper, we consider the problem of generalized network tomography (GNT) 
wherein, the objective is to estimate the link performance parameters for networks 
with arbitrary topologies using only end-to-end measurements of pure unicast probe 
packets. Mathematically, given a binary matrix A, we propose a novel method,
henceforth called the distribution tomography (DT) scheme, to accurately estimate the
distribution of X, a vector of independent non-negative random variables, using only
IID samples of the components of the random vector Y = AX. In fact, our scheme
does not even require prior knowledge of the distribution of X. We thus overcome the 
limitations of the previous approaches. 

We rely on the fact that the class of generalized hyperexponential (GH) distributions
is dense in the set of non-negative distributions (see [3]). Using this, the idea
is to approximate the distribution of each component of X using linear combinations 
of known exponential bases and estimate the unknown weights. These weights are 
obtained by solving a set of polynomial systems based on the moment generating 
function of the components of Y. For unique identifiability, it is only required that 
every pair of columns of the matrix A be linearly independent, a property that holds 
true for the path-link matrix of all multicast tree networks and more. 

The rest of the paper is organized as follows. In the next section, we develop
the notation and formally describe the problem. Section 3 recaps the theory of
approximating non-negative distributions using linear combinations of exponentials. In
Sections 4 and 5 we develop our proposed method and demonstrate its universal
applicability. We give numerical examples in Section 6 and end with a short discussion
in the final section.

We highlight at the outset that the aim of this paper is to establish the theoretical
justification for the proposed scheme. The numerical examples presented are only for
illustrative purposes.

2. Model and Problem Description. Any cumulative distribution function
(CDF) that we work with is always assumed to be continuous with support $(0, \infty)$.
The moment generating function (MGF) of the random variable $X$ will be $M_X(t) =
\mathbb{E}(\exp(-tX))$. For $n \in \mathbb{Z}_{++}$, we use $[n]$ and $S_n$ to represent respectively the set
$\{1, \ldots, n\}$ and its permutation group. We use $\binom{n}{k_1, \ldots, k_d}$ to represent $\frac{n!}{k_1! \cdots k_d!}$.
Similarly, $\binom{n}{k}$ stands for $\frac{n!}{k!(n-k)!}$. We use the notation $\mathbb{R}$, $\mathbb{R}_+$ and $\mathbb{R}_{++}$ to denote
respectively the set of real numbers, non-negative real numbers and strictly positive real
numbers. In the same spirit, for integers, we use $\mathbb{Z}$, $\mathbb{Z}_+$ and $\mathbb{Z}_{++}$. All vectors are
column vectors and their lengths refer to the usual Euclidean norm. For $\delta > 0$, $B(v; \delta)$
represents the open $\delta$-ball around the vector $v$. To denote the derivative of the map
$f$ with respect to $x$, we use $\dot{f}(x)$. Lastly, all empty sums and empty products equal
0 and 1 respectively.

Let $X_1, \ldots, X_N$ denote the independent non-negative random variables whose
distribution we wish to estimate. We assume that each $X_j$ has a GH distribution of
the form

(2.1)    $F_j(u) = \sum_{k=1}^{d+1} w_{jk} \left[ 1 - \exp(-\lambda_k u) \right], \quad u \geq 0,$

where $\lambda_k > 0$, $\sum_{k=1}^{d+1} w_{jk} \lambda_k \exp(-\lambda_k u) \geq 0$ and $\sum_{k=1}^{d+1} w_{jk} = 1$. Note that the weights
$\{w_{jk}\}$, unlike for hyperexponential distributions, are not required to be all positive.
Further, we suppose that $\lambda_1, \ldots, \lambda_{d+1}$ are distinct and explicitly known and that
the weight vectors of distinct random variables differ at least in one component.
Let $A \in \{0, 1\}^{m \times N}$ denote an a priori known matrix which is 1-identifiable in the
following sense.

Definition 2.1. A matrix $A$ is $k$-identifiable if every set of $2k$ of its columns
is linearly independent.
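
As a hedged illustration of Definition 2.1 with $k = 1$, the following minimal Python sketch checks whether every pair of columns of a matrix is linearly independent. The function names and the two example matrices are hypothetical and chosen only for illustration; the test uses the fact that two vectors are linearly independent exactly when some $2 \times 2$ minor formed from them is non-zero.

```python
from itertools import combinations


def columns(A):
    """Return the columns of a row-major matrix as tuples."""
    return [tuple(row[j] for row in A) for j in range(len(A[0]))]


def pair_independent(u, v):
    """Two vectors are linearly independent iff some 2x2 minor is non-zero."""
    return any(u[r] * v[s] - u[s] * v[r] != 0
               for r, s in combinations(range(len(u)), 2))


def is_one_identifiable(A):
    """Definition 2.1 with k = 1: every pair of columns is linearly independent."""
    return all(pair_independent(u, v) for u, v in combinations(columns(A), 2))


# Illustrative path-link matrices (rows are paths, columns are links).
A_good = [[1, 1, 0],
          [0, 1, 1],
          [1, 0, 1]]
A_bad = [[1, 1, 0],
         [1, 1, 1]]  # the first two columns coincide

print(is_one_identifiable(A_good), is_one_identifiable(A_bad))
```

For a binary matrix this check reduces to requiring that all columns be non-zero and pairwise distinct.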

Let $X \equiv (X_1, \ldots, X_N)$ and $Y = AX$. For each $i \in [m]$, let $p_i := \{j \in [N] :
a_{ij} = 1\}$. Further, we presume that, $\forall i \in [m]$, we have access to a sequence of IID
samples $\{Y_{il}\}_{l \geq 1}$ of $Y_i$. Our problem then is to estimate for each $X_j$ its vector of
weights $w_j \equiv (w_{j1}, \ldots, w_{jd})$ and consequently its complete distribution $F_j$, since

$w_{j(d+1)} = 1 - \sum_{k=1}^{d} w_{jk}.$
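
To make the measurement model concrete, here is a minimal simulation sketch of how IID samples of each $Y_i$ might be generated. Everything in it is an assumption made for illustration: the matrix, the exponential link delays, their rates, and the function name `sample_paths`. Each path draws fresh link delays, so samples of different paths are never paired, which mimics unsynchronized unicast probing.

```python
import random


def sample_paths(A, link_samplers, L, seed=0):
    """Draw L IID samples of each path metric Y_i = sum of X_j over links j in p_i.

    Every path gets its own independent link draws, so no sample of Y_i
    is paired with a sample of another path (asynchronous unicast probes).
    """
    rng = random.Random(seed)
    samples = []
    for row in A:
        links = [j for j, a in enumerate(row) if a == 1]  # the set p_i
        samples.append([sum(link_samplers[j](rng) for j in links)
                        for _ in range(L)])
    return samples


# Illustrative setup: three links with exponential delays of rates 2, 1 and 4.
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
rates = [2.0, 1.0, 4.0]
link_samplers = [lambda rng, lam=lam: rng.expovariate(lam) for lam in rates]
Y = sample_paths(A, link_samplers, L=5000)
print([round(sum(y) / len(y), 2) for y in Y])  # sample means of Y_1, Y_2, Y_3
```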

Before developing the estimation procedure, we begin by making a case for the
distribution model of (2.1).

3. Approximating distribution functions. Let $\mathcal{G} = \{G^1, \ldots, G^N\}$ denote a
finite family of arbitrary non-negative distributions. For the problem of simultaneously
estimating all members of $\mathcal{G}$, a useful strategy, as we demonstrate now, is to
approximate each $G^i$ by a GH distribution.

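Recall that the CDF of a GH random variable $X$ is given by

(3.1)    $F_X(u) = \sum_{k=1}^{d+1} \alpha_k \left[ 1 - \exp(-\lambda_k u) \right], \quad u \geq 0,$

where $\lambda_k > 0$, $\sum_{k=1}^{d+1} \alpha_k \lambda_k \exp(-\lambda_k u) \geq 0$ and $\sum_{k=1}^{d+1} \alpha_k = 1$. Consequently, its MGF
is given by

(3.2)    $M_X(t) = \sum_{k=1}^{d+1} \alpha_k \frac{\lambda_k}{\lambda_k + t}.$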
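
The two formulas above are simple to evaluate numerically. The sketch below is illustrative only: the function names and the choice of weights $(2, -1)$ with rates $(1, 2)$ are assumptions. That choice has a negative weight yet a non-negative density $2e^{-u} - 2e^{-2u}$; it is the distribution of the sum of two independent exponentials with rates 1 and 2, so its MGF, with the convention $M_X(t) = \mathbb{E}(\exp(-tX))$, should factor as $\frac{1}{1+t} \cdot \frac{2}{2+t}$.

```python
import math


def gh_cdf(u, weights, rates):
    """GH CDF: sum_k alpha_k * (1 - exp(-lambda_k * u)), for u >= 0."""
    return sum(a * (1.0 - math.exp(-lam * u)) for a, lam in zip(weights, rates))


def gh_mgf(t, weights, rates):
    """GH MGF under the convention M_X(t) = E[exp(-t X)]."""
    return sum(a * lam / (lam + t) for a, lam in zip(weights, rates))


# Illustrative GH distribution with one negative weight.
weights = [2.0, -1.0]
rates = [1.0, 2.0]

for t in (0.1, 1.0, 3.0):
    product_form = (1.0 / (1.0 + t)) * (2.0 / (2.0 + t))
    print(t, gh_mgf(t, weights, rates), product_form)
```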



GUGAN THOPPE

In addition to the simple algebraic form of the above quantities, the other major
reason to use the GH class is that, in the sense of weak topology, it is dense in the
set of all non-negative distributions (see [2]). In fact, as the following result from [21]
shows, we have much more.

Theorem 3.1. For $n, k \in \mathbb{Z}_{++}$, let $X_{n,k}$ be a nonnegative GH random variable
with mean $k/n$, variance $\sigma_{n,k}^2$ and CDF $W_{n,k}$. Suppose

1. the function $\nu : \mathbb{Z}_{++} \to \mathbb{Z}_{++}$ satisfies $\lim_{n \to \infty} \nu(n)/n = \infty$.

2. there exists $0 < s < 1$ such that $\lim_{n \to \infty} n^{1+s} \sigma_{n,k}^2 / k = 0$ uniformly with respect
to $k$.

Then given any continuous non-negative distribution function $F$, the following holds:

1. the function $F_n$ given by

$F_n(u) = \sum_{k=1}^{\nu(n)} \{F(k/n) - F((k-1)/n)\} W_{n,k}(u) + (1 - F(\nu(n)/n)) W_{n,\nu(n)+1}(u)$

is a GH distribution for every $n \in \mathbb{Z}_{++}$ and

2. $F_n$ converges uniformly to $F$, i.e.,

$\lim_{n \to \infty} \sup_{-\infty < u < \infty} |F_n(u) - F(u)| = 0.$

Observe that, for each $n$, the exponential stage parameters of $F_n$ depend only on
the choice of the random variables $\{X_{n,k} : 1 \leq k \leq \nu(n) + 1\}$.

What this observation and the above result imply in relation to $\mathcal{G}$ is that if we fix
the random variables $\{X_{n,k}\}$ and let $M^i$ denote the MGF of $G^i$, then for any given
$\epsilon_1, \epsilon_2 > 0$ and any finite set $\tau = \{t_1, \ldots, t_k\} \subset \mathbb{R}_+$, $\exists n \equiv n(\epsilon_1, \epsilon_2, \tau) \in \mathbb{Z}_{++}$ such
that for each $i \in [N]$, $G^i$ and its $n$-th GH approximation, $F_n^i$, are $\epsilon_1$-close in the sup
norm and for each $j \in [k]$, $|M^i(t_j) - M_n^i(t_j)| \leq \epsilon_2$. Further, the exponential stage
parameters are explicitly known and identical across the approximations $F_n^1, \ldots, F_n^N$,
which now justifies our model of (2.1).

The problem of estimating the individual members of $\mathcal{G}$ can thus be reduced to
determining the vector of weights that characterizes each approximation and hence
each distribution.

4. Distribution tomography scheme. The outline for this section is as follows.
For each $i \in [m]$, we use the IID samples of $Y_i$ to estimate its MGF and
subsequently build a polynomial system, say $H_i(x) = 0$. We call this the elementary
polynomial system (EPS). We then show that for each $i \in [m]$ and each $j \in p_i$, a close
approximation of the vector $w_j$ is present in the solution set of $H_i(x) = 0$, denoted
$V(H_i)$. To match the weight vectors to the corresponding random variables, we make
use of the fact that $A$ is 1-identifiable.

4.1. Construction of elementary polynomial systems. Fix an arbitrary
$i \in [m]$ and suppose that $|p_i| = N_i$, i.e., $Y_i$ is a sum of $N_i$ random variables, which
for notational convenience, we relabel as $X_1, \ldots, X_{N_i}$. Because of independence of the
random variables, observe that the MGF of $Y_i$ is well defined $\forall t \in \mathbb{R}_{++}$ and satisfies
the relation

(4.1)    $M_{Y_i}(t) = \prod_{j=1}^{N_i} \left\{ \sum_{k=1}^{d+1} w_{jk} \frac{\lambda_k}{\lambda_k + t} \right\}.$

On simplification, after substituting $w_{j(d+1)} = 1 - \sum_{k=1}^{d} w_{jk}$, we get

(4.2)    $\mu_i(t) = \prod_{j=1}^{N_i} \left[ \sum_{k=1}^{d} w_{jk} \Lambda_k(t) + \lambda_{d+1} \right],$

where $\Lambda_k(t) = (\lambda_k - \lambda_{d+1}) t / (\lambda_k + t)$ and $\mu_i(t) = M_{Y_i}(t) (\lambda_{d+1} + t)^{N_i}$.
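
The passage from the product of link MGFs to the product form with $\Lambda_k(t) = (\lambda_k - \lambda_{d+1}) t / (\lambda_k + t)$ and $\mu_i(t) = M_{Y_i}(t)(\lambda_{d+1} + t)^{N_i}$ can be checked numerically. The sketch below is only a sanity check under assumed values: the rates $(5, 3, 1)$, the two weight vectors, and all function names are illustrative and not taken from the paper's experiments.

```python
def Lambda(k, t, rates):
    """Lambda_k(t) = (lambda_k - lambda_{d+1}) * t / (lambda_k + t); k is 0-based."""
    return (rates[k] - rates[-1]) * t / (rates[k] + t)


def link_mgf(t, w, rates):
    """MGF E[exp(-t X_j)] of a GH link; the last weight is 1 - sum(w)."""
    full = list(w) + [1.0 - sum(w)]
    return sum(a * lam / (lam + t) for a, lam in zip(full, rates))


def mu(t, W, rates):
    """mu_i(t) = M_{Y_i}(t) * (lambda_{d+1} + t)^{N_i}, with Y_i the sum of the links."""
    m = 1.0
    for w in W:
        m *= link_mgf(t, w, rates)
    return m * (rates[-1] + t) ** len(W)


def product_form(t, W, rates):
    """prod_j [ sum_k w_jk * Lambda_k(t) + lambda_{d+1} ]."""
    d = len(rates) - 1
    out = 1.0
    for w in W:
        out *= sum(w[k] * Lambda(k, t, rates) for k in range(d)) + rates[-1]
    return out


rates = [5.0, 3.0, 1.0]            # lambda_1, lambda_2, lambda_3 (d = 2)
W = [[0.2, 0.3], [0.5, 0.1]]       # first d weights of two links (N_i = 2)
for t in (0.3, 1.0, 2.0):
    print(t, mu(t, W, rates), product_form(t, W, rates))
```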

For now, let us assume that we know $M_{Y_i}(t)$ and hence $\mu_i(t)$ exactly for every
valid $t$. We will refer henceforth to this situation as the ideal case. Treating $t$ as a
parameter, we can then use (4.2) to define a canonical polynomial

(4.3)    $f(x; t) = \prod_{j=1}^{N_i} \left[ \sum_{k=1}^{d} x_{jk} \Lambda_k(t) + \lambda_{d+1} \right] - \mu_i(t),$

where $x \equiv (x_1, \ldots, x_{N_i})$ with $x_j \equiv (x_{j1}, \ldots, x_{jd})$. As this is a multivariate map in
$d \cdot N_i$ variables, we can choose an arbitrary set $\tau = \{t_1, \ldots, t_{d \cdot N_i}\} \subset \mathbb{R}_{++}$ consisting
of distinct numbers and define an intermediate square polynomial system

(4.4)    $F_\tau(x) \equiv (f_1(x), \ldots, f_{d \cdot N_i}(x)) = 0,$

where $f_k(x) \equiv f(x; t_k)$.

Since (4.4) depends on choice of $\tau$, analyzing it directly is difficult. But observe
that i) the expansion of each $f_n$ or equivalently (4.3) results in rational coefficients in
$t$ of the form $\Lambda_1^{\omega_1}(t) \cdots \Lambda_d^{\omega_d}(t) \lambda_{d+1}^{\omega_{d+1}}$, where, for each $k$, $\omega_k \in \mathbb{Z}_+$ and $\sum_{k=1}^{d+1} \omega_k = N_i$,
and ii) the monomials that constitute each $f_n$ are identical. This suggests that one
may be able to get a simpler representation for (4.4). We do so in the following three
steps, where the first two focus on simplifying (4.3).

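Step 1- Gather terms with common coefficients: Let $\omega$ denote the $d$-dimensional vector
$(\omega_1, \ldots, \omega_d)$ and let $\Omega \equiv (\omega, \omega_{d+1})$. Also, let

$\Delta_{d+1, N_i} := \left\{ \Omega \in \mathbb{Z}_+^{d+1} : \sum_{k=1}^{d+1} \omega_k = N_i \right\}.$

For a vector $b \equiv (b_1, \ldots, b_{N_i}) \in [d+1]^{N_i}$, let its type be denoted by $\theta(b) \equiv
(\theta_1(b), \ldots, \theta_{d+1}(b))$, where $\theta_k(b)$ is the count of the element $k$ in $b$. For every
$\Omega \in \Delta_{d+1, N_i}$, additionally define the set

$B_\Omega := \{ b \equiv (b_1, \ldots, b_{N_i}) \in [d+1]^{N_i} : \theta(b) = \Omega \}$

and the polynomial $g(x; \Omega) = \sum_{b \in B_\Omega} \left( \prod_{j=1, b_j \neq d+1}^{N_i} x_{j b_j} \right)$. For any $\omega \in \mathbb{Z}_+^d$, let $\Lambda^\omega =
\Lambda_1^{\omega_1}(t) \cdots \Lambda_d^{\omega_d}(t)$. Then collecting terms with common coefficients in (4.3), the above
notations help us rewrite it as

(4.5)    $f(x; t) = \sum_{\Omega \in \Delta_{d+1, N_i}} g(x; \Omega) \Lambda^\omega \lambda_{d+1}^{\omega_{d+1}} - \mu_i(t).$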

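Step 2- Coefficient expansion and regrouping: Using an idea similar to partial fraction
expansion for rational functions in $t$, the goal here is to decompose each $\Lambda^\omega$ into
simpler terms. For each $j, k \in [d]$, $j \neq k$, let

$\beta_{jk} := \frac{\lambda_j (\lambda_k - \lambda_{d+1})}{\lambda_j - \lambda_k}.$

For each $\omega \in \mathbb{Z}_+^d$, let $\mathcal{D}(\omega) := \{k \in [d] : \omega_k > 0\}$. Further, if $\mathcal{D}(\omega) \neq \emptyset$, then
$\forall k \in \mathcal{D}(\omega)$ and $\forall q \in [\omega_k]$, let

$\Delta_{kq}(\omega) := \left\{ s \equiv (s_1, \ldots, s_d) \in \mathbb{Z}_+^d : s_r = 0, \forall r \in \mathcal{D}(\omega)^c \cup \{k\}; \sum_{n=1}^{d} s_n = \omega_k - q \right\}$ and

(4.6)    $\gamma_{kq}(\omega) := \sum_{s \in \Delta_{kq}(\omega)} \prod_{r \in \mathcal{D}(\omega) \setminus \{k\}} \binom{\omega_r + s_r - 1}{\omega_r - 1} \beta_{kr}^{\omega_r} \beta_{rk}^{s_r}.$

The desired decomposition is now given in the following result.
Lemma 4.1. If $\mathcal{D}(\omega) \neq \emptyset$, then

(4.7)    $\Lambda^\omega = \sum_{k \in \mathcal{D}(\omega)} \sum_{q=1}^{\omega_k} \gamma_{kq}(\omega) \Lambda_k^q(t)$

for all $t$. Further, this expansion is unique.
Proof. See Appendix B. □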



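By applying this expansion to each coefficient in (4.5) and regrouping, we have

(4.8)    $f(x; t) = \sum_{k=1}^{d} \sum_{q=1}^{N_i} h_{kq}(x) \Lambda_k^q(t) \lambda_{d+1}^{N_i - q} - c(t),$

where $c(t) = \mu_i(t) - \lambda_{d+1}^{N_i}$ and

(4.9)    $h_{kq}(x) = \sum_{\Omega \in \Delta_{d+1, N_i},\ \omega_k \geq q} \frac{\gamma_{kq}(\omega)}{\lambda_{d+1}^{N_i - q - \omega_{d+1}}} g(x; \Omega).$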



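We consider a simple example to better illustrate the above two steps.

Example 1. Suppose $N_i = 3$ and $d = 2$. In this case, clearly $\Lambda_k(t) = \frac{(\lambda_k - \lambda_3) t}{\lambda_k + t}$,
$\mu_i(t) = (\lambda_3 + t)^3 M_{Y_i}(t)$ and $x \equiv (x_{11}, x_{12}, x_{21}, x_{22}, x_{31}, x_{32})$. Equation (4.3) is

$f(x; t) = \prod_{j=1}^{3} \left[ x_{j1} \Lambda_1(t) + x_{j2} \Lambda_2(t) + \lambda_3 \right] - \mu_i(t),$

while (4.5) is

$f(x; t) = x_{11} x_{21} x_{31} \Lambda_1^3(t) + x_{12} x_{22} x_{32} \Lambda_2^3(t)$
$\quad + (x_{11} x_{21} + x_{11} x_{31} + x_{21} x_{31}) \Lambda_1^2(t) \lambda_3$
$\quad + (x_{12} x_{22} + x_{12} x_{32} + x_{22} x_{32}) \Lambda_2^2(t) \lambda_3$
$\quad + (x_{11} x_{21} x_{32} + x_{11} x_{22} x_{31} + x_{12} x_{21} x_{31}) \Lambda_1^2(t) \Lambda_2(t)$
$\quad + (x_{11} x_{22} x_{32} + x_{12} x_{21} x_{32} + x_{12} x_{22} x_{31}) \Lambda_1(t) \Lambda_2^2(t)$
$\quad + (x_{11} + x_{21} + x_{31}) \Lambda_1(t) \lambda_3^2 + (x_{12} + x_{22} + x_{32}) \Lambda_2(t) \lambda_3^2$
$\quad + (x_{11} x_{22} + x_{11} x_{32} + x_{12} x_{21} + x_{21} x_{32} + x_{12} x_{31} + x_{22} x_{31}) \Lambda_1(t) \Lambda_2(t) \lambda_3$
$\quad + \lambda_3^3 - \mu_i(t).$

Now observe that

$\Lambda_1(t) \Lambda_2(t) = \beta_{12} \Lambda_1(t) + \beta_{21} \Lambda_2(t),$

and

$\Lambda_1^2(t) \Lambda_2(t) = \beta_{12} \Lambda_1^2(t) + \beta_{12} \beta_{21} \Lambda_1(t) + \beta_{21}^2 \Lambda_2(t),$
$\Lambda_1(t) \Lambda_2^2(t) = \beta_{12}^2 \Lambda_1(t) + \beta_{21} \Lambda_2^2(t) + \beta_{12} \beta_{21} \Lambda_2(t).$

Substituting these identities, we can write $f(x; t)$ in terms of (4.8) as

$f(x; t) = h_{11}(x) \Lambda_1(t) \lambda_3^2 + h_{12}(x) \Lambda_1^2(t) \lambda_3 + h_{13}(x) \Lambda_1^3(t)$
$\quad + h_{21}(x) \Lambda_2(t) \lambda_3^2 + h_{22}(x) \Lambda_2^2(t) \lambda_3 + h_{23}(x) \Lambda_2^3(t) + \lambda_3^3 - \mu_i(t),$

where

$h_{11}(x) = x_{11} + x_{21} + x_{31} + \frac{\beta_{12} \beta_{21}}{\lambda_3^2} (x_{11} x_{21} x_{32} + x_{11} x_{22} x_{31} + x_{12} x_{21} x_{31})$
$\quad + \frac{\beta_{12}^2}{\lambda_3^2} (x_{11} x_{22} x_{32} + x_{12} x_{21} x_{32} + x_{12} x_{22} x_{31})$
$\quad + \frac{\beta_{12}}{\lambda_3} (x_{11} x_{22} + x_{11} x_{32} + x_{12} x_{21} + x_{21} x_{32} + x_{12} x_{31} + x_{22} x_{31}),$

$h_{12}(x) = x_{11} x_{21} + x_{11} x_{31} + x_{21} x_{31} + \frac{\beta_{12}}{\lambda_3} (x_{11} x_{21} x_{32} + x_{11} x_{22} x_{31} + x_{12} x_{21} x_{31}),$

$h_{13}(x) = x_{11} x_{21} x_{31},$

and so on.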
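
The three product identities used in Example 1 can be verified numerically. The sketch below assumes $\beta_{jk} = \lambda_j (\lambda_k - \lambda_{d+1}) / (\lambda_j - \lambda_k)$, which is the value forced by matching both sides of the first identity as rational functions of $t$; the rates $(5, 3, 1)$ and the function names are illustrative choices.

```python
def Lam(k, t, lam):
    """Lambda_k(t) = (lambda_k - lambda_{d+1}) * t / (lambda_k + t); k is 0-based."""
    return (lam[k] - lam[-1]) * t / (lam[k] + t)


def beta(j, k, lam):
    """Assumed coefficient beta_jk = lambda_j (lambda_k - lambda_{d+1}) / (lambda_j - lambda_k)."""
    return lam[j] * (lam[k] - lam[-1]) / (lam[j] - lam[k])


lam = [5.0, 3.0, 1.0]                      # lambda_1, lambda_2, lambda_3
b12, b21 = beta(0, 1, lam), beta(1, 0, lam)


def identity_gaps(t):
    """Residuals of the three identities of Example 1 at a given t."""
    L1, L2 = Lam(0, t, lam), Lam(1, t, lam)
    return (
        L1 * L2 - (b12 * L1 + b21 * L2),
        L1 ** 2 * L2 - (b12 * L1 ** 2 + b12 * b21 * L1 + b21 ** 2 * L2),
        L1 * L2 ** 2 - (b12 ** 2 * L1 + b21 * L2 ** 2 + b12 * b21 * L2),
    )


print(b12, b21)
print([identity_gaps(t) for t in (0.5, 2.0)])
```

All residuals should vanish up to floating-point rounding.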



Step 3- Eliminate dependence on $\tau$: The advantage of (4.8) is that, apart from $c(t)$, the
number of $t$-dependent coefficients equals $d \cdot N_i$, which is exactly the number of
unknowns in the polynomial $f$. Further, as shown below, they are linearly independent.

Lemma 4.2. For $k \in [d \cdot N_i]$, let $b_k := \min\{j \in [d] : j \cdot N_i \geq k\}$. Then the matrix
$T_\tau$, where, for $j, k \in [d \cdot N_i]$,

(4.10)    $(T_\tau)_{jk} = \Lambda_{b_k}^{k - (b_k - 1) N_i}(t_j) \, \lambda_{d+1}^{b_k N_i - k},$

is non-singular.

Proof. See Appendix [S] □

Observe that if we let $c_\tau \equiv (c(t_1), \ldots, c(t_{d \cdot N_i}))$, $\mathcal{E}_k(x) \equiv (h_{k1}(x), \ldots, h_{k N_i}(x))$
and $\mathcal{E}(x) \equiv (\mathcal{E}_1(x), \ldots, \mathcal{E}_d(x))$, then (4.4) can be equivalently expressed as

(4.11)    $T_\tau \mathcal{E}(x) - c_\tau = 0.$

Premultiplying (4.11) by $(T_\tau)^{-1}$, which now exists by Lemma 4.2, we have

(4.12)    $\mathcal{E}(x) - (T_\tau)^{-1} c_\tau = 0.$

Clearly, $w \equiv (w_1, \ldots, w_{N_i})$ is a root of (4.3) and hence of (4.12). This immediately
implies that $(T_\tau)^{-1} c_\tau = \mathcal{E}(w)$ and consequently (4.12) can be rewritten as

(4.13)    $H_i(x) \equiv \mathcal{E}(x) - \mathcal{E}(w) = 0.$

Note that (4.13) is devoid of any reference to the set $\tau$ and can be arrived at using
any valid $\tau$. For this reason, we will henceforth refer to (4.13) as the EPS.

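Example 2. Let $N_i = d = 2$. Also, let $\lambda_1 = 5$, $\lambda_2 = 3$ and $\lambda_3 = 1$. Then the map
$\mathcal{E}$ described above is given by

(4.14)    $\mathcal{E}(x) = \begin{pmatrix} x_{11} + x_{21} + 5 (x_{11} x_{22} + x_{12} x_{21}) \\ x_{11} x_{21} \\ x_{12} + x_{22} - 6 (x_{11} x_{22} + x_{12} x_{21}) \\ x_{12} x_{22} \end{pmatrix}.$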
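
The map of Example 2 can be cross-checked against the product form of the canonical polynomial. The sketch below assumes the signs $+5$ and $-6$ on the cross term, which follow from $\Lambda_1(t) \Lambda_2(t) = 5 \Lambda_1(t) - 6 \Lambda_2(t)$ for rates $(5, 3, 1)$; the function names and the test point are illustrative. It compares $\prod_{j} [x_{j1} \Lambda_1 + x_{j2} \Lambda_2 + \lambda_3]$ with $h_{11} \Lambda_1 \lambda_3 + h_{12} \Lambda_1^2 + h_{21} \Lambda_2 \lambda_3 + h_{22} \Lambda_2^2 + \lambda_3^2$, and also illustrates that the map is unchanged when the two weight vectors are swapped.

```python
def E_map(x):
    """Map of Example 2 for lambda = (5, 3, 1); x = (x11, x12, x21, x22)."""
    x11, x12, x21, x22 = x
    cross = x11 * x22 + x12 * x21
    return (x11 + x21 + 5 * cross,   # h_11
            x11 * x21,               # h_12
            x12 + x22 - 6 * cross,   # h_21
            x12 * x22)               # h_22


def lambdas(t):
    """Lambda_1(t) and Lambda_2(t) for lambda = (5, 3, 1)."""
    return 4 * t / (5 + t), 2 * t / (3 + t)


def product_side(x, t):
    """prod_j [x_j1 Lambda_1 + x_j2 Lambda_2 + lambda_3], i.e. f(x;t) + mu_i(t)."""
    x11, x12, x21, x22 = x
    L1, L2 = lambdas(t)
    return (x11 * L1 + x12 * L2 + 1) * (x21 * L1 + x22 * L2 + 1)


def expanded_side(x, t):
    """Same quantity written through the map, with lambda_3 = 1."""
    h11, h12, h21, h22 = E_map(x)
    L1, L2 = lambdas(t)
    return h11 * L1 + h12 * L1 ** 2 + h21 * L2 + h22 * L2 ** 2 + 1


x = (0.2, 0.3, 0.5, 0.1)
print(product_side(x, 1.5), expanded_side(x, 1.5))
print(E_map(x) == E_map((x[2], x[3], x[0], x[1])))   # symmetry under swapping x_1, x_2
```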



Two useful properties of the EPS are stated next. For $\sigma \in S_{N_i}$, let $x_\sigma :=
(x_{\sigma(1)}, \ldots, x_{\sigma(N_i)})$ denote a permutation of the vectors $x_1, \ldots, x_{N_i}$. Further, for any
$x \in \mathbb{R}^{d \cdot N_i}$, let $\pi_x := \{x_\sigma : \sigma \in S_{N_i}\}$.

Lemma 4.3. $H_i(x) = H_i(x_\sigma)$, $\forall \sigma \in S_{N_i}$. That is, the map $H_i$ is symmetric.

Proof. Observe that $H_i(x) = (T_\tau)^{-1} F_\tau(x)$ and $F_\tau$, as defined in (4.4), is symmetric.
The result thus follows. □

Lemma 4.4. There exists an open dense set $\mathcal{R}_i$ of $\mathbb{R}^{d \cdot N_i}$ such that if $w \in \mathcal{R}_i$,
then $|V(H_i)| = k \times N_i!$, where $k \in \mathbb{Z}_{++}$ is independent of $w \in \mathcal{R}_i$. Further, each
solution is non-singular.

Proof. See Appendix C. □

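We henceforth assume that $w \in \mathcal{R}_i$. From the definition of the EPS in (4.13), it
is obvious that $w \in V(H_i)$. By Lemma 4.3 it also follows that if $x^* \in V(H_i)$, then
$\pi_{x^*} \subset V(H_i)$. Hence, it suffices to work with

(4.15)    $\mathcal{M}_i = \{\alpha \in \mathbb{C}^d : \exists x^* \in V(H_i) \text{ with } x_1^* = \alpha\}.$

Clearly, $\mathcal{W}_i := \{w_1, \ldots, w_{N_i}\} \subset \mathcal{M}_i$. A point to note here is that $\mathcal{I}_i := \mathcal{M}_i \setminus \mathcal{W}_i$ is not
empty in general.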

Our next objective is to develop the above theory for the case where for each
$i \in [m]$, instead of the exact value of $M_{Y_i}(t)$, we have access only to the IID realizations
$\{Y_{il}\}_{l \geq 1}$ of the random variable $Y_i$. That is, for each $k \in [d \cdot N_i]$, we have to use the
sample average $\hat{M}_{Y_i}(t_k; L) = \left( \sum_{l=1}^{L} \exp(-t_k Y_{il}) \right) / L$ for an appropriately chosen large
$L$, $\hat{c}(t_k; L) = \hat{M}_{Y_i}(t_k; L) (\lambda_{d+1} + t_k)^{N_i} - \lambda_{d+1}^{N_i}$ and $\hat{c}_{\tau, L} \equiv (\hat{c}(t_1; L), \ldots, \hat{c}(t_{d \cdot N_i}; L))$ as
substitutes for each $M_{Y_i}(t_k)$, each $c(t_k)$ and $c_\tau$ respectively. But even then note that
the noisy or the perturbed version of the EPS

(4.16)    $\hat{H}_i(x) \equiv \mathcal{E}(x) - (T_\tau)^{-1} (\hat{c}_{\tau, L}) = 0$

is always well defined. More importantly, the perturbation is only in its constant
term. As in Lemma 4.3, it then follows that the map $\hat{H}_i$ is symmetric.
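
As a rough illustration of the estimation step, the sketch below computes the sample-average MGF and the corresponding perturbed constant $\hat{c}(t; L) = \hat{M}_{Y_i}(t; L)(\lambda_{d+1} + t)^{N_i} - \lambda_{d+1}^{N_i}$ from simulated data. The setup is an assumption made for illustration: $Y_i$ is the sum of two exponential links with rates 5 and 3 (GH with respect to the rates $(5, 3, 1)$), and the function names are hypothetical.

```python
import math
import random


def empirical_mgf(samples, t):
    """Sample-average estimate of M_Y(t) = E[exp(-t Y)]."""
    return sum(math.exp(-t * y) for y in samples) / len(samples)


def c_hat(samples, t, lam_last, n_links):
    """Perturbed constant: M_hat(t) * (lambda_{d+1} + t)^{N_i} - lambda_{d+1}^{N_i}."""
    return empirical_mgf(samples, t) * (lam_last + t) ** n_links - lam_last ** n_links


# Illustrative data: Y is the sum of two exponential link delays (rates 5 and 3).
rng = random.Random(1)
samples = [rng.expovariate(5.0) + rng.expovariate(3.0) for _ in range(20000)]

t = 0.7
est = empirical_mgf(samples, t)
exact = (5.0 / (5.0 + t)) * (3.0 / (3.0 + t))
print(est, exact, c_hat(samples, t, lam_last=1.0, n_links=2))
```

Since $\exp(-tY) \in [0, 1]$, the estimate concentrates quickly around the exact MGF as $L$ grows.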

Next observe that since $\mathcal{R}_i$ is open (see Lemma 4.4), there exists a small enough
$\delta_i > 0$ such that $B(w; \delta_i) \subset \mathcal{R}_i$. Using the regularity of solutions of the EPS (see
Lemma 4.4), the inverse function theorem then gives the following result.

Lemma 4.5. Let $\delta \in (0, \delta_i)$ be such that for any two distinct solutions in $V(H_i)$,
say $x^*$ and $y^*$, $B(x^*; \delta) \cap B(y^*; \delta) = \emptyset$. Then there exists an $\epsilon(\delta) > 0$ such that if
$u \in \mathbb{R}^{d \cdot N_i}$ and $\|u - \mathcal{E}(w)\| < \epsilon(\delta)$, then the solution set $V(\hat{H}_i)$ of the perturbed EPS
$\mathcal{E}(x) - u = 0$ satisfies the following:

1. All roots in $V(\hat{H}_i)$ are regular points of the map $\mathcal{E}$.

2. For each $x^* \in V(H_i)$, there is one and only one $z^* \in V(\hat{H}_i)$ such that
$\|x^* - z^*\| < \delta$.

As a consequence, we have the following. 



Lemma 4.6. Let $\delta$ and $\epsilon(\delta)$ be as described in Lemma 4.5. Then for tolerable
failure rate $\kappa > 0$ and the chosen set $\tau$, $\exists L_{\tau, \delta, \kappa} \in \mathbb{Z}_{++}$ such that if $L \geq L_{\tau, \delta, \kappa}$, then
with probability greater than $1 - \kappa$, we have $\|(T_\tau)^{-1} \hat{c}_{\tau, L} - \mathcal{E}(w)\| < \epsilon(\delta)$.

Proof. Note that $\exp(-t_k Y_{il}) \in [0, 1]$ $\forall i, l$ and $k$. The Hoeffding inequality (see
P] ) then shows that for any $\epsilon > 0$, $\Pr\{|\hat{M}_{Y_i}(t_k; L) - M_{Y_i}(t_k)| > \epsilon\} \leq 2 \exp(-2 \epsilon^2 L)$.
Since $\mathcal{E}(w) = (T_\tau)^{-1} c_\tau$, the result is now immediate. □
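
To give a feel for the sample sizes involved, the following sketch computes the smallest $L$ for which the two-sided Hoeffding bound $2 \exp(-2 \epsilon^2 L)$, combined with a union bound over $n$ evaluation points, guarantees that all $n$ MGF estimates are within $\epsilon$ of their true values with probability at least $1 - \kappa$. This only controls the MGF estimation errors; translating it into the threshold $\epsilon(\delta)$ of Lemma 4.5 would additionally involve $(T_\tau)^{-1}$ and is not attempted here. The function name and the numbers are illustrative assumptions.

```python
import math


def hoeffding_sample_size(n_points, eps, kappa):
    """Smallest L with n_points * 2 * exp(-2 * eps^2 * L) <= kappa."""
    return math.ceil(math.log(2.0 * n_points / kappa) / (2.0 * eps ** 2))


def failure_bound(n_points, eps, L):
    """Union bound on the probability that some MGF estimate is off by more than eps."""
    return 2.0 * n_points * math.exp(-2.0 * eps ** 2 * L)


# Illustrative numbers: d * N_i = 4 evaluation points, accuracy 0.01, failure rate 5%.
L_needed = hoeffding_sample_size(4, 0.01, 0.05)
print(L_needed, failure_bound(4, 0.01, L_needed))
```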



The above two results, in simple words, state that solving (4.16) for a large enough
$L$, with high probability, is almost as good as solving the EPS of (4.13). For $L \geq L_{\tau, \delta, \kappa}$,
let $\mathcal{A}_i(\kappa)$ denote the event $\|(T_\tau)^{-1} \hat{c}_{\tau, L} - \mathcal{E}(w)\| < \epsilon(\delta)$. Clearly, $\Pr\{\mathcal{A}_i^c(\kappa)\} \leq \kappa$. As
in (4.15), let

(4.17)    $\hat{\mathcal{M}}_i = \{\hat{\alpha} \in \mathbb{C}^d : \exists z^* \in V(\hat{H}_i) \text{ with } z_1^* = \hat{\alpha}\}.$

We are now done discussing the EPS for an arbitrary $i \in [m]$. In summary, we
have managed to obtain a set $\hat{\mathcal{M}}_i$ in which close approximations of the weight vectors
of the random variables $X_j$ that add up to give $Y_i$ are present with high probability. The
next subsection takes a unified view of the solution sets $\{\hat{\mathcal{M}}_i : i \in [m]\}$ to match the
weight vectors to the corresponding random variables. But before that, we redefine
$\mathcal{W}_i$ as $\{w_j : j \in p_i\}$. Accordingly, $\mathcal{M}_i$, $\hat{\mathcal{M}}_i$, $V(H_i)$ and $V(\hat{H}_i)$ are also redefined using
notations of Section 2.

4.2. Parameter matching using 1-identifiability. We begin by giving a
physical interpretation for the 1-identifiability condition of the matrix $A$. For this,
let $\mathcal{G}_j := \{i \in [m] : j \in p_i\}$ and $\mathcal{B}_j := [m] \setminus \mathcal{G}_j$.

Lemma 4.7. For a 1-identifiable matrix $A$, each index $j \in [N]$ satisfies

$\{j\} = \bigcap_{g \in \mathcal{G}_j} p_g \cap \bigcap_{b \in \mathcal{B}_j} p_b^c =: \mathcal{D}_j.$

Proof. By definition, $j \in \mathcal{D}_j$. For the converse, if $k \in \mathcal{D}_j$, $k \neq j$, then columns $j$ and $k$
of $A$ are identical, contradicting its 1-identifiability condition. Thus $\{j\} = \mathcal{D}_j$. □
An immediate result is the following.

Corollary 4.8. Suppose $A$ is a 1-identifiable matrix. If the map $u : [N] \to \mathcal{X}$,
where $\mathcal{X}$ is an arbitrary set, is bijective and $\forall i \in [m]$, $\mathcal{V}_i := \{u(j) : j \in p_i\}$, then for
each $j \in [N]$

$\{u(j)\} = \bigcap_{g \in \mathcal{G}_j} \mathcal{V}_g \cap \bigcap_{b \in \mathcal{B}_j} \mathcal{V}_b^c.$

By reframing this, we get the following result.

Theorem 4.9. Suppose $A$ is a 1-identifiable matrix. If the weight vectors
$w_1, \ldots, w_N$ are pairwise distinct, then the rule

(4.18)    $\psi : j \to \bigcap_{g \in \mathcal{G}_j} \mathcal{W}_g \cap \bigcap_{b \in \mathcal{B}_j} \mathcal{W}_b^c$

satisfies $\psi(j) = \{w_j\}$.

This result is where the complete potential of the 1-identifiability condition of
$A$ is truly taken advantage of. What this states is that if we had access to the
collection of sets $\{\mathcal{W}_i : i \in [m]\}$, then using $\psi$ we would have been able to uniquely
match the weight vectors to the random variables. But note that, at present, we have
access only to the collection $\{\mathcal{M}_i : i \in [m]\}$ in the ideal case and $\{\hat{\mathcal{M}}_i : i \in [m]\}$ in
the perturbed case. In spite of this, we now show that if $\forall i_1, i_2 \in [m]$, $i_1 \neq i_2$,

(4.19)    $\mathcal{I}_{i_1} \cap \mathcal{M}_{i_2} = \emptyset,$

a condition that always held in simulation experiments, then the rules:

(4.20)    $\psi : j \to \bigcap_{g \in \mathcal{G}_j} \mathcal{M}_g \cap \bigcap_{b \in \mathcal{B}_j} \mathcal{M}_b^c$

for the ideal case, and

(4.21)    $\hat{\psi} : j \to \bigcap_{g \in \mathcal{G}_j} \hat{\mathcal{M}}_g \cap \bigcap_{b \in \mathcal{B}_j} \hat{\mathcal{M}}_b^c$

in the perturbed case, with minor modifications recover the correct weight vector
associated to each random variable $X_j$.
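
The intersection rule of Theorem 4.9 is easy to prototype with sets. In the sketch below, the matrix, the weight vectors, and the function name `match` are all illustrative assumptions, and the sets passed in are the exact sets $\{w_j : j \in p_i\}$ rather than solution sets of a polynomial system. For each link $j$, the rule intersects the sets of all paths that contain $j$ and then removes everything seen on paths that do not contain $j$.

```python
def match(A, W):
    """psi(j): intersect W_g over paths g containing link j, drop W_b for the others."""
    m, N = len(A), len(A[0])
    psi = {}
    for j in range(N):
        G = [i for i in range(m) if A[i][j] == 1]   # paths through link j
        B = [i for i in range(m) if A[i][j] == 0]   # paths avoiding link j
        cand = set.intersection(*(set(W[g]) for g in G))
        for b in B:
            cand -= set(W[b])
        psi[j] = cand
    return psi


# Illustrative 1-identifiable path-link matrix and pairwise distinct weight vectors.
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
w = [(0.2, 0.3), (0.5, 0.1), (0.4, 0.4)]
W = [{w[j] for j in range(3) if A[i][j] == 1} for i in range(3)]

print(match(A, W))
```

The sketch relies on every column of the matrix being non-zero, which 1-identifiability guarantees whenever there are at least two links.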

We first discuss the ideal case. Let $S := \{j \in [N] : |\mathcal{G}_j| \geq 2\}$. Because of (4.19)
and Theorem 4.9, note that

1. If $j \in S$, then $\psi(j) = \{w_j\}$.

2. If $j \in S^c$ and $j \in p_{i^*}$, then $\psi(j) = \{w_j\} \cup \mathcal{I}_{i^*}$.

That is, (4.20) works perfectly fine when $j \in S$. The problem arises only when $j \in S^c$
as $\psi(j)$ does not give as output a unique vector. To correct this, fix $j \in S^c$. If
$j \in p_{i^*}$, then let $v_{sub} \equiv (w_k : k \in p_{i^*} \setminus \{j\})$. Because of 1-identifiability, note that if
$k \in p_{i^*} \setminus \{j\}$, then $k \in S$. From (4.3) and (4.4), it is also clear that $(v_{sub}, \alpha) \in V(H_{i^*})$
if and only if $\alpha = w_j$. This suggests that we need to match parameters in two stages.
In stage 1, we use (4.20) to assign weight vectors to all those random variables $X_j$
such that $j \in S$. In stage 2, for each $j \in S^c$, we identify $i^* \in [m]$ such that $j \in p_{i^*}$. We
then construct $v_{sub}$. We then assign to $j$ that unique $\alpha$ for which $(v_{sub}, \alpha) \in V(H_{i^*})$.
Note that we are ignoring the trivial case where $|p_{i^*}| = 1$. It is now clear that by
using (4.20) with modifications as described above, at least for the ideal case, we can
uniquely recover back for each random variable $X_j$ its corresponding weight vector.

We next handle the case of noisy measurements. Let $\mathcal{U} := \bigcup_{i \in [m]} \mathcal{M}_i$ and $\hat{\mathcal{U}} :=
\bigcup_{i \in [m]} \hat{\mathcal{M}}_i$. Observe that using (4.21) directly, with probability one, will satisfy $\hat{\psi}(j) = \emptyset$
for each $j \in [N]$. This happens because we are distinguishing across the solution
sets the estimates obtained for a particular weight vector. Hence as a first step we
need to define a relation $\sim$ on $\hat{\mathcal{U}}$ that associates these related elements. Recall from
Lemmas 4.5 and 4.6 that the set $\hat{\mathcal{M}}_i$ can be constructed for any small enough choice
of $\delta, \kappa > 0$. With a choice of $\delta$ that satisfies



(4.22) 



< 46 < min I 



let us consider the event $\mathcal{A} := \bigcap_{i \in [m]} \mathcal{A}_i(\kappa / m)$. Using a simple union bound, it follows
that $\Pr\{\mathcal{A}^c\} \leq \kappa$. Now suppose that the event $\mathcal{A}$ is a success. Then by (4.22) and
Lemma 4.5, the following observations follow trivially.

1. For each $i \in [m]$ and each $\alpha \in \mathcal{M}_i$, there exists at least one $\hat{\alpha} \in \hat{\mathcal{M}}_i$ such
that $\|\hat{\alpha} - \alpha\| < \delta$.

2. For each $i \in [m]$ and each $\hat{\alpha} \in \hat{\mathcal{M}}_i$, there exists precisely one $\alpha \in \mathcal{M}_i$ such
that $\|\hat{\alpha} - \alpha\| < \delta$.

3. Suppose for distinct elements $\alpha, \beta \in \mathcal{U}$, we have $\hat{\alpha}, \hat{\beta} \in \hat{\mathcal{U}}$ such that $\|\hat{\alpha} - \alpha\| <
\delta$ and $\|\hat{\beta} - \beta\| < \delta$. Then $\|\hat{\alpha} - \hat{\beta}\| > 2\delta$.

From these, it is clear that the relation $\sim$ on $\hat{\mathcal{U}}$ should be

(4.23)    $\hat{\alpha} \sim \hat{\beta}$ iff $\|\hat{\alpha} - \hat{\beta}\| < 2\delta.$

It is also easy to see that, whenever the event $\mathcal{A}$ is a success, $\sim$ defines an equivalence
relation on $\hat{\mathcal{U}}$. For each $i \in [m]$, the obvious idea then is to replace each element of
$\hat{\mathcal{M}}_i$ and its corresponding $d$-dimensional component in $V(\hat{H}_i)$ with its equivalence
class. It now follows that (4.21), with modifications as was done for the ideal case,
will satisfy

(4.24)    $\hat{\psi}(j) = \{\hat{\alpha} \in \hat{\mathcal{U}} : \|\hat{\alpha} - w_j\| < \delta\}.$

This is obviously the best we could have done starting from the set $\{\hat{\mathcal{M}}_i : i \in [m]\}$.
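
The grouping of noisy estimates under the relation "closer than $2\delta$" can be sketched as follows. This is an illustrative implementation under simplifying assumptions: the estimates are real tuples (solutions may in general be complex), the function names are hypothetical, and the routine returns `None` when the relation fails to be transitive on the given points, in which case a smaller $\delta$ would have to be tried.

```python
import math


def close(a, b, delta):
    """Relation of (4.23): Euclidean distance below 2 * delta."""
    return math.dist(a, b) < 2.0 * delta


def equivalence_classes(points, delta):
    """Group points into classes; return None if the relation is not transitive."""
    classes = []
    for p in points:
        hits = [c for c in classes if any(close(p, q, delta) for q in c)]
        if not hits:
            classes.append([p])
        elif len(hits) == 1 and all(close(p, q, delta) for q in hits[0]):
            hits[0].append(p)
        else:
            return None  # p links points that are not themselves related
    return classes


# Illustrative estimates of two weight vectors obtained from different paths.
estimates = [(0.201, 0.299), (0.5, 0.102), (0.198, 0.301), (0.499, 0.1)]
print(equivalence_classes(estimates, delta=0.01))

# A chain of points for which the relation is not an equivalence relation.
print(equivalence_classes([(0.0,), (0.015,), (0.03,)], delta=0.01))
```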

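We end this section by summarizing our complete method in an algorithmic fashion.

Algorithm 1. Distribution tomography

Phase 1: Construct & Solve the EPS.
For each $i \in [m]$,

1. Choose an arbitrary $\tau = \{t_1, \ldots, t_{d \cdot N_i}\}$ of distinct positive real numbers.

2. Pick a large enough $L \in \mathbb{Z}_{++}$. Set $\hat{M}_{Y_i}(t_j) = \left( \sum_{l=1}^{L} \exp(-t_j Y_{il}) \right) / L$, $\hat{\mu}_i(t_j) =
(\lambda_{d+1} + t_j)^{N_i} \hat{M}_{Y_i}(t_j)$ and $\hat{c}(t_j) = \hat{\mu}_i(t_j) - \lambda_{d+1}^{N_i}$ for each $j \in [d \cdot N_i]$. Using this,
construct $\hat{c}_\tau \equiv (\hat{c}(t_1), \ldots, \hat{c}(t_{d \cdot N_i}))$.

3. Solve $\mathcal{E}(x) - T_\tau^{-1} \hat{c}_\tau = 0$ using any standard solver for polynomial systems.

4. Build $\hat{\mathcal{M}}_i = \{\hat{\alpha} \in \mathbb{C}^d : \exists x^* \in V(\hat{H}_i) \text{ with } x_1^* = \hat{\alpha}\}$.

Phase 2: Parameter Matching

1. Set $\hat{\mathcal{U}} := \bigcup_{i \in [m]} \hat{\mathcal{M}}_i$. Choose $\delta > 0$ small enough and define the relation $\sim$
on $\hat{\mathcal{U}}$, where $\hat{\alpha} \sim \hat{\beta}$ if and only if $\|\hat{\alpha} - \hat{\beta}\|_2 < 2\delta$. If $\sim$ is not an equivalence
relation, then choose a smaller $\delta$ and repeat.

2. Construct the quotient set $\hat{\mathcal{U}} / \sim$. Replace all elements of each $\hat{\mathcal{M}}_i$ and each
$V(\hat{H}_i)$ with their equivalence class.

3. For each $j \in S$, set $\hat{\psi}(j) = \left( \bigcap_{g \in \mathcal{G}_j} \hat{\mathcal{M}}_g \right) \cap \left( \bigcap_{b \in \mathcal{B}_j} \hat{\mathcal{M}}_b^c \right)$.

4. For each $j \in S^c$,

(a) Set $i^* = i \in [m]$ such that $j \in p_i$.

(b) Construct $\hat{v}_{sub} \equiv (\hat{\psi}(k) : k \in p_{i^*}, k \neq j)$.

(c) Set $\hat{\psi}(j) = \hat{\alpha}$ such that $(\hat{v}_{sub}, \hat{\alpha}) \in V(\hat{H}_{i^*})$.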

5. Universality. The crucial step in the DT scheme described above was to
come up with, for each $i \in [m]$, a well behaved polynomial system, i.e., one that
satisfies the properties of Lemma 4.4, based solely on the samples of $Y_i$. Once that
was done, the ability to match parameters to the component random variables was
only a consequence of the 1-identifiability condition of the matrix $A$. This suggests
that it may be possible to develop similar schemes even in settings different from the
ones assumed in Section 2. In fact, functions other than the MGF could also serve as
blueprints for constructing the polynomial system. We briefly discuss a few of these
ideas in this section. Note that we prefer polynomial systems for
the sole reason that there exist computationally efficient algorithms, see for example
dH UH EU HI] , to determine all their roots.

Consider the case, where $\forall j \in [N]$, the distribution of $X_j$ is the finite mixture
model

(5.1)    $F_j(u) = \sum_{k=1}^{d_j + 1} w_{jk} \phi_{jk}(u),$

where $d_j \in \mathbb{Z}_{++}$, $w_{j1}, \ldots, w_{j(d_j + 1)}$ denote the mixing weights, i.e., $w_{jk} \geq 0$ and
$\sum_{k=1}^{d_j + 1} w_{jk} = 1$, and $\{\phi_{jk}(u)\}$ are some basis functions, say Gaussian, uniform, etc.
The MGF of each $X_j$ is clearly given by

(5.2)    $M_{X_j}(t) = \sum_{k=1}^{d_j + 1} w_{jk} \int_{u=0}^{\infty} \exp(-ut) \, d\phi_{jk}(u).$

Now note that if the basis functions $\{\phi_{jk}\}$ are completely known, then the MGF of
each $Y_i$ will again be a polynomial in the mixing weights, $\{w_{jk}\}$, similar in spirit
to the relation of (4.1). As a result, the complete recipe of Section 4 can again be
attempted to estimate the weight vectors of the random variables $X_j$ using only the
IID samples of each $Y_i$.



In relation to (2.1) or (5.1), observe next that $\forall n \in \mathbb{Z}_{++}$, the $n$-th moment of each
$X_j$ is given by

(5.3)    $\mathbb{E}(X_j^n) = \sum_{k=1}^{d_j + 1} w_{jk} \int_{u=0}^{\infty} u^n \, d\phi_{jk}(u).$

Hence, the $n$-th moment of $Y_i$ is again a polynomial in the unknown weights. This
suggests that, instead of the MGF, one could use the estimates of the moments of
$Y_i$ to come up with an alternative polynomial system and consequently solve for the
distribution of each $X_j$.



Moving away from the models of (2.1) and (5.1), suppose that for each $j \in [N]$,
$X_j \sim \exp(m_j)$. Assume that each mean $m_j < \infty$ and that $m_{j_1} \neq m_{j_2}$ when $j_1 \neq j_2$.
We claim that the basic idea of our method can be used here to estimate $m_1, \ldots, m_N$
and hence the complete distribution of each $X_j$ using only the samples of $Y_i$. As the
steps are quite similar when either i) we know $M_{Y_i}(t)$ for each $i \in [m]$ and every valid
$t$ or ii) we have access only to the IID samples $\{Y_{il}\}_{l \geq 1}$ for each $i \in [m]$, we take up
only the first case.

Fix $i \in [m]$ and let $p_i := \{j \in [N] : a_{ij} = 1\}$. To simplify notations, let us relabel
the random variables $\{X_j : j \in p_i\}$ that add up to give $Y_i$ as $X_1, \ldots, X_{N_i}$, where
$N_i = |p_i|$. Observe that the MGF of $Y_i$, after inversion, satisfies

(5.4)    $\prod_{j=1}^{N_i} (1 + t m_j) = 1 / M_{Y_i}(t).$

Using (5.4), we can then define the canonical polynomial

(5.5)    $f(x; t) := \prod_{j=1}^{N_i} (1 + t x_j) - c(t),$

where $x \equiv (x_1, \ldots, x_{N_i})$ and $c(t) = 1 / M_{Y_i}(t)$. Now choose an arbitrary set $\tau =
\{t_1, \ldots, t_{N_i}\} \subset \mathbb{R}_{++}$ consisting of distinct numbers and define

(5.6) F T (x) = (/ 1 (x),...,/ JVi (x))=0, 

where /fc(x) = f(x;tk). We emphasize that this system is square of size N%, de- 
pends on the choice of subset r and each polynomial //. is symmetric with respect 
to the variables X\, . . . , x^ i . In fact, if we let c r = (c(ii), c(t N .)) and £(x) = 
(ei(x), . . .,e Ni (x)), where e fc (x) = J2i< 3l<32< ... <3k <N z x h ' ' ' ^ denotes the k th ele- 
mentary symmetric polynomial in the Ni variables x\ t . . . ,xjs[ i , we can rewrite ( |5.6[ ) 
as 

(5.7) T r £(x) - (c T - 1) = 0. 

Here T T denotes a Vandermonde matrix of order N in t 1; . . . , tj^ i with (TV)^. = t^. 
Its determinant, given by det(T T ) = (r[j=i*ij X\.j>i(Pj ~~ i s clearly non-zero. 
Premultiplying (5.7) by T~ l , we have 

(5.8) f(x)-T- 1 (c T -l)=0. 



Observe now that the vector $m \equiv (m_1, \ldots, m_{N_i})$ is a natural root of (5.6) and hence of (5.8). Hence $T_\tau^{-1}(c_\tau - 1) = \mathcal{E}(m)$. The EPS for this case can thus be written as

(5.9)  $H_i(x) \equiv \mathcal{E}(x) - \mathcal{E}(m) = 0.$

We next discuss the properties of this EPS, or more specifically, its solution set. For this, let $V(H_i) := \{x \in \mathbb{C}^{N_i} : H_i(x) = 0\}$.

Lemma 5.1. $V(H_i) = \pi_m := \{\sigma(m) : \sigma \in S_{N_i}\}$.

Proof. This follows directly from (5.9). □

Lemma 5.2. For every $x^* \in V(H_i)$, $\det(\dot{\mathcal{E}}(x^*)) \neq 0$.

Proof. This follows from the fact that $\det(\dot{\mathcal{E}}(x)) = \prod_{1 \leq j < k \leq N_i} (x_j - x_k)$. □

Because of Lemma 5.1 it suffices to work with only the first components of the roots. Hence we define

(5.10)  $M_i := \{\alpha^* \in \mathbb{C} : \exists\, x^* \in V(H_i) \text{ with } x_1^* = \alpha^*\},$

which in this case is equivalent to the set $\{m_1, \ldots, m_{N_i}\}$. Reverting back to global notations, note that

(5.11)  $M_i = \{m_j : j \in p_i\}.$
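The steps from (5.4) to (5.11) can be traced numerically. The Python sketch below is only an illustration under simplifying assumptions: the MGF is taken as exactly known (case i), the path has two links, and the link means and the set $\tau$ are hypothetical values chosen here. The helper names are not from the paper.

```python
from fractions import Fraction
from math import sqrt

def solve_linear(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def inverse_mgf(means, t):
    """c(t) = 1 / M_Y(t) = prod_j (1 + t m_j) for a sum of independent exponentials."""
    c = Fraction(1)
    for m in means:
        c *= 1 + t * m
    return c

def elementary_symmetric_from_mgf(means, tau):
    """Solve T_tau E = c_tau - 1 with (T_tau)_{jk} = t_j^k, as in (5.7)-(5.8)."""
    n = len(tau)
    T = [[t ** k for k in range(1, n + 1)] for t in tau]
    rhs = [inverse_mgf(means, t) - 1 for t in tau]
    return solve_linear(T, rhs)

true_means = [Fraction(1, 2), Fraction(2)]   # hypothetical means on a two-link path
tau = [Fraction(1), Fraction(3)]             # arbitrary distinct positive numbers
e1, e2 = elementary_symmetric_from_mgf(true_means, tau)

# By (5.9), the solutions are the roots of z^2 - e1 z + e2 = 0.
disc = sqrt(float(e1 * e1 - 4 * e2))
recovered = sorted([(float(e1) - disc) / 2, (float(e1) + disc) / 2])
print(e1, e2, recovered)
```

For paths with more than two links the same linear solve applies, but the final step would need a general polynomial root finder in place of the quadratic formula.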

Since $i$ was arbitrary, we can repeat the above procedure to obtain the collection of solution sets $\{M_i : i \in [m]\}$. Arguing as in Theorem 4.9, it now follows that if $A$ is 1-identifiable, then the rule

(5.12)  $\psi(j) = \left(\bigcap_{g \in \mathcal{G}_j} M_g\right) \cap \left(\bigcap_{b \in \mathcal{B}_j} M_b^c\right),$

where $\mathcal{G}_j = \{i \in [m] : j \in p_i\}$ and $\mathcal{B}_j = [m] \setminus \mathcal{G}_j$, satisfies the relation $\psi(j) = m_j$. That is, having obtained the sets $\{M_i : i \in [m]\}$, one can use $\psi$ to match the parameters to the corresponding random variables.
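The matching rule (5.12) amounts to plain set operations. The following Python sketch applies it in an idealized, noise-free setting where the sets $M_i$ are known exactly. The path-link matrix and the means are hypothetical values chosen for illustration, and the function name is not from the paper.

```python
from fractions import Fraction

def match_parameters(A, solution_sets):
    """For each link j, intersect M_i over paths using j and drop values in other M_i."""
    n_links = len(A[0])
    matched = {}
    for j in range(n_links):
        good = [i for i, row in enumerate(A) if row[j] == 1]   # paths containing link j
        bad = [i for i, row in enumerate(A) if row[j] == 0]    # paths avoiding link j
        candidates = set.intersection(*(solution_sets[i] for i in good))
        for i in bad:
            candidates -= solution_sets[i]
        matched[j + 1] = candidates
    return matched

# Hypothetical two-path, three-link example in which every pair of columns differs.
A = [[1, 1, 0],
     [1, 0, 1]]
means = {1: Fraction(1, 2), 2: Fraction(2), 3: Fraction(5, 4)}
solution_sets = [{means[j + 1] for j, on in enumerate(row) if on} for row in A]

psi = match_parameters(A, solution_sets)
print(psi)
```

With estimated rather than exact sets, equality of elements would have to be replaced by closeness within some tolerance; this sketch does not attempt that.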

This clearly demonstrates that even if a transformation of the MGF is a polynomial in the parameters to be estimated, our method may be applicable.

Fig. 6.1. Networks for simulation experiments: (a) tree topology; (b) general topology.

6. Experimental Results. We assess the performance of our DT scheme using Matlab based simulation experiments.

We consider the simplified network delay tomography setup wherein, given a sequence of end-to-end measurements of the delay that probe packets experience across a subset of paths in a network, we are required to estimate the delay distribution across each link. In particular, we suppose that the topology of the network is known a priori in the form of its path-link matrix, denoted $A \in \{0, 1\}^{m \times N}$, and is unvarying during the measurement phase. The rows of $A$ correspond to the paths across which probe packets can be transmitted and their delay measured, while the columns correspond to individual links. Further, the element $a_{ij}$ is 1 precisely when the $j$-th link is present on the $i$-th path. We let $X_j$, a GH random variable, denote the probe packet delay across link $j$ and $Y_i$ the delay across path $i$. We assume that the delays across different links are independent. If we let $X \equiv (X_1, \ldots, X_N)$ and $Y \equiv (Y_1, \ldots, Y_m)$, then observe that $Y = AX$. Clearly, this setup now resembles the model of Section 2.

We simulate the networks given in Figure 6.1. The specifics of each experiment and the observations made are described next. Note that, unless specified otherwise, all values are rounded to 2 significant digits.

6.1. Network with Tree Topology. We work here with the network of Figure 6.1(a). Node 1 is the source node, while nodes 3 and 4 act as sinks. Path $p_1$ connects the nodes 1, 2 and 3, while path $p_2$ connects the nodes 1, 2 and 4. The path-link matrix is thus given by

$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$


Experiment 1. We set the count of exponential stages in each link distribution to three. That is, we set $d = 2$. We take the corresponding exponential stage parameters $\lambda_1$, $\lambda_2$ and $\lambda_3$ to be 5, 3 and 1 respectively. For each link, the weight associated with each exponential stage is set as given in the columns labeled $w_{j1}, \ldots, w_{j3}$ of Table 6.1.

We first focus on path $p_1$. Observe that its EPS is given by the map of (4.14). We now collect a million samples of its end-to-end delay. Choosing an arbitrary set $\tau = \{1.9857, 2.3782, 0.3581, 8.8619\}$, we run the first phase of the algorithm to obtain $\hat{M}_1$. This set along with its ideal counterpart $M_1$ is given in Table 6.2.

Table 6.1
Actual and estimated weights for Expt. 1

Link   $w_{j1}$   $w_{j2}$   $w_{j3}$   $\hat{w}_{j1}$   $\hat{w}_{j2}$   $\hat{w}_{j3}$
1      0.17       0.80       0.03       0.15             0.82             0.02
2      0.13       0.47       0.40       0.15             0.46             0.39
3      0.80       0.15       0.05       0.79             0.15             0.06
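The data-collection step of Experiment 1 can be imitated with a short simulation. The Python sketch below is not the Matlab code used in the paper: it draws link delays as mixtures of exponentials using the actual weights of Table 6.1, which works here because all those weights are non-negative, and it uses fewer samples and an arbitrary seed. It only checks that empirical path means agree with the means implied by the weights.

```python
import random

def sample_hyperexp(rng, weights, rates):
    """Draw one delay from a mixture of exponentials (non-negative weights only)."""
    rate = rng.choices(rates, weights=weights, k=1)[0]
    return rng.expovariate(rate)

rates = [5.0, 3.0, 1.0]                      # stage parameters of Experiment 1
weights = {1: [0.17, 0.80, 0.03],            # actual weights from Table 6.1
           2: [0.13, 0.47, 0.40],
           3: [0.80, 0.15, 0.05]}
A = [[1, 1, 0],                              # path p1 uses links 1 and 2
     [1, 0, 1]]                              # path p2 uses links 1 and 3

rng = random.Random(0)                       # fixed seed, an arbitrary choice
n_samples = 100_000                          # fewer than the paper's one million

def sample_path(row):
    """One end-to-end delay: sum of fresh, independent link delays on the path."""
    return sum(sample_hyperexp(rng, weights[j + 1], rates)
               for j, on_path in enumerate(row) if on_path)

# Each path is probed separately, so samples of different paths are asynchronous.
path_samples = [[sample_path(row) for _ in range(n_samples)] for row in A]

def link_mean(j):
    """Mean of link j: sum_k w_jk / lambda_k."""
    return sum(w / lam for w, lam in zip(weights[j], rates))

empirical = [sum(s) / n_samples for s in path_samples]
expected = [link_mean(1) + link_mean(2), link_mean(1) + link_mean(3)]
print(empirical, expected)
```

Generating the samples of the two paths independently mirrors the point that the method does not need synchronous measurements.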
Table 6.2
Solution set of the EPS for path $p_1$ in Expt. 1

sol-ID   $M_1$                $\hat{M}_1$
1        (0.1300, 0.4700)     (0.1542, 0.4558)
2        (0.1700, 0.8000)     (0.1292, 0.8356)
3        (3.8304, -2.8410)    (3.8525, -2.8646)
4        (0.1933, 0.7768)     (0.2260, 0.7394)
5        (0.1143, 0.4840)     (0.0882, 0.5152)
6        (0.0058, -0.1323)    (0.0052, -0.1330)
Similarly, by probing the path $p_2$ with another million samples and with $\tau = \{0.0842, 0.0870, 0.0305, 0.0344\}$, we determine $\hat{M}_2$. The sets $M_2$ and $\hat{M}_2$ are given in Table 6.3.



Table 6.3
Solution set of the EPS for path $p_2$ in Expt. 1

sol-ID   $M_2$                $\hat{M}_2$
1        (0.8000, 0.1500)     (0.7933, 0.1459)
2        (0.1660, 0.7775)     (0.1720, 0.8095)
3        (5.5623, -4.5638)    (5.5573, -4.5584)
4        (0.1700, 0.8000)     (0.1645, 0.7669)
5        (0.8191, 0.1543)     (0.8296, 0.1540)
6        (0.0245, -0.0263)    (0.0246, -0.0259)
To match the weight vectors to the corresponding links, first observe that the minimum distance between $\hat{M}_1$ and $\hat{M}_2$ is 0.0502. Based on this, we choose $\delta = 0.03$ and run the second phase of the algorithm. The obtained results are given in the second half of Table 6.1. Note that the weights obtained for the first link are determined by taking a simple average of the solutions obtained from the two different paths. The norm of the error vector is 0.0443.
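The two figures quoted in this paragraph can be recomputed approximately from the tabulated values. The Python sketch below assumes that both the distance and the error norm are Euclidean, which the text does not state explicitly, and it works from the rounded table entries, so it is not expected to match the reported numbers to every digit.

```python
from math import sqrt

# Estimated solution sets, copied from Tables 6.2 and 6.3 (rounded values).
M1_hat = [(0.1542, 0.4558), (0.1292, 0.8356), (3.8525, -2.8646),
          (0.2260, 0.7394), (0.0882, 0.5152), (0.0052, -0.1330)]
M2_hat = [(0.7933, 0.1459), (0.1720, 0.8095), (5.5573, -4.5584),
          (0.1645, 0.7669), (0.8296, 0.1540), (0.0246, -0.0259)]

def euclid(p, q):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Smallest distance between a point of one set and a point of the other.
min_dist = min(euclid(p, q) for p in M1_hat for q in M2_hat)

# Actual and estimated weights from Table 6.1, flattened link by link.
actual = [0.17, 0.80, 0.03, 0.13, 0.47, 0.40, 0.80, 0.15, 0.05]
estimate = [0.15, 0.82, 0.02, 0.15, 0.46, 0.39, 0.79, 0.15, 0.06]
err_norm = euclid(actual, estimate)

print(round(min_dist, 4), round(err_norm, 4))
```

From the rounded entries this gives roughly 0.050 for the minimum distance and roughly 0.041 for the error norm, close to the reported 0.0502 and 0.0443; the remaining gap is plausibly due to the two-digit rounding in Table 6.1.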

Experiment 2. Keeping other things unchanged as in the setup of Experiment 1, we consider here four exponential stages in the distribution of each $X_j$. The exponential stage parameters $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ equal 5, 4, 0.005 and 1 respectively. The corresponding weights are given in Table 6.4. Observe, however, that the weight of the third stage is negligible for all three links. Because of this, we ignore its presence completely. That is, we consider $d = 2$ and then run the algorithm. The results obtained are given in the second half of Table 6.4. The norm of the error vector is 0.1843.

This clearly demonstrates that the exponential stages which are insignificant across all link distributions can be ignored.

Table 6.4
Actual and estimated weights for Expt. 2

Link   $w_{j1}$   $w_{j2}$   $w_{j3}$   $w_{j4}$   $\hat{w}_{j1}$   $\hat{w}_{j2}$   $\hat{w}_{j3}$   $\hat{w}_{j4}$
1      0.71       0.20       0.0010     0.08       0.77             0.20             -                0.03
2      0.41       0.17       0.0015     0.41       0.38             0.18             -                0.44
3      0.15       0.80       0.0002     0.04       0.12             0.70             -                0.18
6.2. Network with General Topology. We deal here with the network of Figure 6.1(b). Nodes 1 and 2 act as sources while nodes 3 and 4 act as sinks. We consider here three paths. Path $p_1$ connects the nodes 1, 2 and 3, path $p_2$ connects the nodes 1, 2 and 4, while path $p_3$ connects the nodes 2, 3 and 4. The path-link matrix $A$ is a $3 \times 4$ binary matrix, with one row per path and one column per link.

Experiment 3. The values of $d$, $\lambda_1$, $\lambda_2$, $\lambda_3$ are set as in Experiment 1. By choosing again a million probe packets for each path, we run the algorithm. The actual and estimated weights are shown in Table 6.5.



Table 6.5
Actual and estimated weights for Expt. 3

Link   $w_{j1}$   $w_{j2}$   $w_{j3}$   $\hat{w}_{j1}$   $\hat{w}_{j2}$   $\hat{w}_{j3}$
1      0.34       0.26       0.40       0.34             0.24             0.42
2      0.46       0.49       0.05       0.45             0.50             0.05
3      0.12       0.65       0.23       0.11             0.68             0.21
4      0.71       0.19       0.10       0.69             0.18             0.13

The ease with which our algorithm can handle even networks that have non-tree 
topologies is clearly demonstrated in this experiment. 

7. Discussion. This paper took advantage of the properties of polynomial systems to develop a novel algorithm for the GNT problem. For an arbitrary matrix $A$ which is 1-identifiable, it demonstrated how to accurately estimate the distribution of the random vector $X$, with mutually independent components, using only IID samples of the components of the random vector $Y = AX$. Translating to network terminology, this means that one can now address the tomography problem even for networks with arbitrary topologies using only pure unicast probe packet measurements. The fact that we need only the IID samples of the components of $Y$ shows that the processes to acquire these samples across different paths can be asynchronous. Another nice feature of this approach is that it can estimate the unknown link level performance parameters even when no prior information is available about them.

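Appendix A. Nonsingularity of coefficient matrix. In this section, we prove three results which will together demonstrate the validity of Lemma 4.2.

Lemma A.1. Let $\omega_1, \omega_2 \in \mathbb{Z}_{++}$. Further suppose that $\lambda_1, \lambda_2$ and $t_1, \ldots, t_{\omega_1+\omega_2}$ are strictly positive real numbers satisfying $\lambda_1 \neq \lambda_2$ and $t_{k_1} \neq t_{k_2}$, if $k_1 \neq k_2$. Then the square matrix

$W = \begin{pmatrix} \frac{1}{(\lambda_1+t_1)^{\omega_1}} & \cdots & \frac{1}{(\lambda_1+t_1)} & \frac{1}{(\lambda_2+t_1)^{\omega_2}} & \cdots & \frac{1}{(\lambda_2+t_1)} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{1}{(\lambda_1+t_{\omega_1+\omega_2})^{\omega_1}} & \cdots & \frac{1}{(\lambda_1+t_{\omega_1+\omega_2})} & \frac{1}{(\lambda_2+t_{\omega_1+\omega_2})^{\omega_2}} & \cdots & \frac{1}{(\lambda_2+t_{\omega_1+\omega_2})} \end{pmatrix}$

is non-singular.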

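Proof. It suffices to show that

(A.1)  $\det(W) = \left(\prod_{j=1}^{\omega_1+\omega_2} \prod_{k=1}^{2} \frac{1}{(\lambda_k+t_j)^{\omega_k}}\right) (\lambda_2 - \lambda_1)^{\omega_1\omega_2} \prod_{1 \leq j_1 < j_2 \leq \omega_1+\omega_2} (t_{j_2} - t_{j_1}).$

As a first step, we perform on $W$ the row operations

$\text{row}_j \leftarrow \text{row}_j \times (\lambda_1+t_j)^{\omega_1} (\lambda_2+t_j)^{\omega_2}$

for each $j \in [\omega_1+\omega_2]$ to get the matrix $B$. Note that for $j, k \in [\omega_1+\omega_2]$,

(A.2)  $(B)_{jk} = \begin{cases} (\lambda_1+t_j)^{k-1} (\lambda_2+t_j)^{\omega_2}, & k \in [\omega_1] \\ (\lambda_1+t_j)^{\omega_1} (\lambda_2+t_j)^{k-\omega_1-1}, & \text{otherwise.} \end{cases}$

From the properties of determinants, it follows that

$\det(W) = \left(\prod_{j=1}^{\omega_1+\omega_2} \frac{1}{(\lambda_1+t_j)^{\omega_1} (\lambda_2+t_j)^{\omega_2}}\right) \det(B).$

To verify (A.1), it only remains to show that

(A.3)  $\det(B) = (\lambda_2 - \lambda_1)^{\omega_1\omega_2} \prod_{1 \leq j_1 < j_2 \leq \omega_1+\omega_2} (t_{j_2} - t_{j_1}).$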
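Before following the proof, identity (A.3) and the relation between $\det(W)$ and $\det(B)$ can be spot-checked with exact rational arithmetic. The Python sketch below assumes the column ordering implied by (A.2), with decreasing powers inside each block of $W$. The helper names and the particular values of $\lambda_1$, $\lambda_2$ and $t_j$ are arbitrary choices made here, and only a few small pairs $(\omega_1, \omega_2)$ are tried.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    M = [[Fraction(v) for v in row] for row in matrix]
    n, sign, result = len(M), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return sign * result

def build_W(lam1, lam2, w1, w2, ts):
    """Row j: 1/(lam1+t_j)^w1, ..., 1/(lam1+t_j), 1/(lam2+t_j)^w2, ..., 1/(lam2+t_j)."""
    return [[1 / (lam1 + t) ** p for p in range(w1, 0, -1)] +
            [1 / (lam2 + t) ** p for p in range(w2, 0, -1)] for t in ts]

def build_B(lam1, lam2, w1, w2, ts):
    """Entries of (A.2): each row of W scaled by (lam1+t_j)^w1 (lam2+t_j)^w2."""
    return [[(lam1 + t) ** (k - 1) * (lam2 + t) ** w2 for k in range(1, w1 + 1)] +
            [(lam1 + t) ** w1 * (lam2 + t) ** (k - 1) for k in range(1, w2 + 1)]
            for t in ts]

def rhs_A3(lam1, lam2, w1, w2, ts):
    """Right-hand side of (A.3)."""
    value = (lam2 - lam1) ** (w1 * w2)
    for j1, j2 in combinations(range(len(ts)), 2):
        value *= ts[j2] - ts[j1]
    return value

lam1, lam2 = Fraction(5), Fraction(3)                      # arbitrary distinct positive values
checks = []
for w1, w2 in [(1, 1), (2, 1), (1, 2), (2, 2)]:
    ts = [Fraction(k, 2) for k in range(1, w1 + w2 + 1)]   # distinct positive t_j
    B = build_B(lam1, lam2, w1, w2, ts)
    W = build_W(lam1, lam2, w1, w2, ts)
    scale = Fraction(1)
    for t in ts:
        scale *= (lam1 + t) ** w1 * (lam2 + t) ** w2
    checks.append(det(B) == rhs_A3(lam1, lam2, w1, w2, ts)
                  and det(W) * scale == det(B))
print(checks)
```

Such a check is of course no substitute for the proof that follows; it only guards against sign or ordering slips in the stated formulas for small sizes.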

Towards this, our approach is to treat $\lambda_2$ as a variable and the other indeterminates, i.e., $\lambda_1, t_1, \ldots, t_{\omega_1+\omega_2}$, as constants and show that $g(\lambda_2)$ is a univariate polynomial of degree $\omega_1\omega_2$ with $\lambda_1$ as its sole root with multiplicity $\omega_1\omega_2$.

We now introduce some notations. Let $b_k(\lambda_2)$ denote the $k$-th column of $B$ and $b_k^{r_k}(\lambda_2)$ its element-wise $r_k$-th derivative with respect to $\lambda_2$. We use $g(\lambda_2)$ and $g^n(\lambda_2)$ to refer respectively to $\det(B)$ and its $n$-th derivative with respect to $\lambda_2$. Lastly, for $\omega, n \in \mathbb{Z}_{++}$, we use $\Delta_{\omega,n}$ to represent the set of non-negative integer valued vector solutions of the equation $a_1 + \cdots + a_\omega = n$.

We now prove (A.3) by equivalently showing the following:

i. $g^n(\lambda_2)|_{\lambda_2=\lambda_1} = 0$ for each $0 \leq n < \omega_1\omega_2$.

ii. $g^{\omega_1\omega_2}(\lambda_2)|_{\lambda_2=\lambda_1} = (\omega_1\omega_2)! \prod_{1 \leq j_1 < j_2 \leq \omega_1+\omega_2} (t_{j_2} - t_{j_1}) \neq 0$.

iii. For all $n > \omega_1\omega_2$, $g^n(\lambda_2) = 0$.

Firstly note that, for any $n \in \mathbb{Z}_{++}$,

(A.4)  $g^n(\lambda_2) = \sum_{r \in \Delta_{\omega_1+\omega_2, n}} \binom{n}{r_1, \ldots, r_{\omega_1+\omega_2}} \det\left(\left[\, b_1^{r_1}(\lambda_2) \;\cdots\; b_{\omega_1+\omega_2}^{r_{\omega_1+\omega_2}}(\lambda_2) \,\right]\right).$

Our strategy is to deal with $\det\left(\left[\, b_1^{r_1}(\lambda_2) \;\cdots\; b_{\omega_1+\omega_2}^{r_{\omega_1+\omega_2}}(\lambda_2) \,\right]\right) =: \mu(\lambda_2; r)$ for every possible pattern of the tuple $r \equiv (r_1, \ldots, r_{\omega_1+\omega_2}) \in \mathbb{Z}_+^{\omega_1+\omega_2}$. Although the different possibilities are huge in number, the following observations, which follow directly from (A.2), will reveal that in almost all combinations of $r$ either a column itself becomes zero or is a scalar multiple of another column. In fact there is only one unique pattern where the determinant will have to be actually evaluated.

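1. The column $b_{\omega_1+1}(\lambda_2)$ is a constant with respect to $\lambda_2$. This implies that if $r_{\omega_1+1} \geq 1$, then $b_{\omega_1+1}^{r_{\omega_1+1}}(\lambda_2) = 0$ and consequently $\mu(\lambda_2; r) = 0$ irrespective of what values $r_k$, $k \neq \omega_1+1$, take. Hence we need to focus on only those tuples $r$ where $r_{\omega_1+1} = 0$.

2. Suppose that $2 \leq k \leq \omega_2$ and $r_{\omega_1+1}, r_{\omega_1+2}, \ldots, r_{\omega_1+k-1} = 0$. Then

$b_{\omega_1+k}^{r_{\omega_1+k}}(\lambda_2) = \begin{cases} 0, & r_{\omega_1+k} \geq k \\ \left(\prod_{j=1}^{r_{\omega_1+k}} (k-j)\right) b_{\omega_1+k-r_{\omega_1+k}}(\lambda_2), & 1 \leq r_{\omega_1+k} \leq k-1. \end{cases}$

In other words, if $r_{\omega_1+1}, r_{\omega_1+2}, \ldots, r_{\omega_1+k-1} = 0$ and $r_{\omega_1+k} > 0$, then $\mu(\lambda_2; r) = 0$.

An inductive argument on $k$ then shows that we need to deal with only those tuples where $r_{\omega_1+1} = \cdots = r_{\omega_1+\omega_2} = 0$. For the next 2 observations, we assume that this condition explicitly holds for the tuple under consideration.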
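3. For column $\omega_1$ note that if $r_{\omega_1} \geq \omega_2+1$, then $b_{\omega_1}^{r_{\omega_1}}(\lambda_2) = 0$. But when $r_{\omega_1} = \omega_2$, then $b_{\omega_1}^{r_{\omega_1}}(\lambda_2) = \omega_2! \left((\lambda_1+t_1)^{\omega_1-1}, \ldots, (\lambda_1+t_{\omega_1+\omega_2})^{\omega_1-1}\right)$. On the other hand, when $\lambda_2 = \lambda_1$,

$b_{\omega_1}^{r_{\omega_1}}(\lambda_2)|_{\lambda_2=\lambda_1} = \begin{cases} b_{\omega_1+\omega_2}(\lambda_2)|_{\lambda_2=\lambda_1}, & r_{\omega_1} = 0 \\ \left(\prod_{j=0}^{r_{\omega_1}-1} (\omega_2 - j)\right) b_{\omega_1+\omega_2-r_{\omega_1}}(\lambda_2)|_{\lambda_2=\lambda_1}, & 1 \leq r_{\omega_1} < \omega_2. \end{cases}$

This implies that $\mu(\lambda_2; r) = 0$ if $r_{\omega_1} \geq \omega_2+1$, while $\mu(\lambda_2; r)|_{\lambda_2=\lambda_1} = 0$ if $0 \leq r_{\omega_1} \leq \omega_2 - 1$. For our purposes, it thus remains to investigate only those tuples where $r_{\omega_1} = \omega_2$.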

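4. The following holds whenever $\omega_1 > 1$. Suppose that there exists $k \in [\omega_1 - 1]$ such that $r_{\omega_1} = r_{\omega_1-1} = \cdots = r_{\omega_1-k+1} = \omega_2$. It is then immediate that $b_{\omega_1-k}^{r_{\omega_1-k}}(\lambda_2) = \omega_2! \left((\lambda_1+t_1)^{\omega_1-k-1}, \ldots, (\lambda_1+t_{\omega_1+\omega_2})^{\omega_1-k-1}\right)$ if $r_{\omega_1-k} = \omega_2$, while $b_{\omega_1-k}^{r_{\omega_1-k}}(\lambda_2) = 0$ if $r_{\omega_1-k} \geq \omega_2+1$. For the case when $r_{\omega_1-k} \in \{0\} \cup [\omega_2 - 1]$, consider the following subcases.

(a) $\omega_1 \leq \omega_2$: Here note that

$b_{\omega_1-k}^{r_{\omega_1-k}}(\lambda_2)|_{\lambda_2=\lambda_1} = \begin{cases} b_{\omega_1+\omega_2-k}(\lambda_2)|_{\lambda_2=\lambda_1}, & r_{\omega_1-k} = 0 \\ \left(\prod_{j=0}^{r_{\omega_1-k}-1} (\omega_2 - j)\right) b_{\omega_1+\omega_2-k-r_{\omega_1-k}}(\lambda_2)|_{\lambda_2=\lambda_1}, & 1 \leq r_{\omega_1-k} < \omega_2 - k \\ \frac{1}{(\omega_2 - r_{\omega_1-k})!}\, b_{\omega_1+\omega_2-k-r_{\omega_1-k}}^{\omega_2}(\lambda_2)|_{\lambda_2=\lambda_1}, & \omega_2 - k \leq r_{\omega_1-k} < \omega_2. \end{cases}$

(b) $\omega_1 > \omega_2$: If $k < \omega_2$, then $b_{\omega_1-k}^{r_{\omega_1-k}}$ behaves precisely as in the subcase above. On the other hand, when $\omega_2 \leq k \leq \omega_1 - 1$ then

$b_{\omega_1-k}^{r_{\omega_1-k}}(\lambda_2)|_{\lambda_2=\lambda_1} = \frac{1}{(\omega_2 - r_{\omega_1-k})!}\, b_{\omega_1+\omega_2-k-r_{\omega_1-k}}^{\omega_2}(\lambda_2)|_{\lambda_2=\lambda_1}, \quad 0 \leq r_{\omega_1-k} < \omega_2.$

As a consequence of a simple induction on $k$, it now follows that $\mu(\lambda_2; r)|_{\lambda_2=\lambda_1} = 0$ for all tuples $r$, where $0 \leq r_1 < \omega_2$ and for each $2 \leq i \leq \omega_1$, $r_i \in \{0\} \cup [\omega_2]$. In contrast, even if one amongst $r_1, \ldots, r_{\omega_1}$ has value strictly bigger than $\omega_2$, then $\mu(\lambda_2; r) = 0$.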

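5. The above observations essentially show that $r^*$, where $r_1^* = \cdots = r_{\omega_1}^* = \omega_2$ and $r_{\omega_1+1}^* = \cdots = r_{\omega_1+\omega_2}^* = 0$, is the only tuple for which $\mu(\lambda_2; r^*)$ is required to be explicitly evaluated. Using properties of Vandermonde matrices, a simple calculation shows that

$\mu(\lambda_2; r^*)|_{\lambda_2=\lambda_1} = (\omega_2!)^{\omega_1} \prod_{1 \leq k_1 < k_2 \leq \omega_1+\omega_2} (t_{k_2} - t_{k_1}) \neq 0.$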



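Let $\mathcal{A}_{<,0} := \{r \equiv (s, 0) \in \mathbb{Z}_+^{\omega_1} \times \mathbb{Z}_+^{\omega_2} : 0 \leq s_1 < \omega_2 \text{ and } 0 \leq s_k \leq \omega_2\ \forall\, 2 \leq k \leq \omega_1\}$, $\mathcal{A}_{r^*} := \{r^*\}$ and $\mathcal{A}_> := \mathbb{Z}_+^{\omega_1+\omega_2} \setminus \{\mathcal{A}_{r^*} \cup \mathcal{A}_{<,0}\}$ be a partition of $\mathbb{Z}_+^{\omega_1+\omega_2}$. Then a short summary of the above observations is that

$\mu(\lambda_2; r)|_{\lambda_2=\lambda_1} = \begin{cases} 0, & r \in \mathcal{A}_{<,0} \\ \mu(\lambda_2; r^*)|_{\lambda_2=\lambda_1}, & r \in \mathcal{A}_{r^*}, \end{cases}$

while $r \in \mathcal{A}_>$ implies that $\mu(\lambda_2; r) = 0$. This last conclusion implies that (A.4) can be explicitly written as

$g^n(\lambda_2) = \sum_{r \in \Delta_{\omega_1+\omega_2, n} \cap \{\mathcal{A}_{r^*} \cup \mathcal{A}_{<,0}\}} \binom{n}{r_1, \ldots, r_{\omega_1+\omega_2}} \mu(\lambda_2; r).$

Now note that $\forall r \in \mathcal{A}_{<,0}$, $\sum_{k=1}^{\omega_1+\omega_2} r_k < \omega_1\omega_2$, while $\sum_{k=1}^{\omega_1+\omega_2} r_k^* = \omega_1\omega_2$. This shows that $\Delta_{\omega_1+\omega_2, n} \cap \mathcal{A}_{<,0} = \emptyset$ for all $n \geq \omega_1\omega_2$, while $\Delta_{\omega_1+\omega_2, n} \cap \mathcal{A}_{r^*} = \emptyset$ for all $n \neq \omega_1\omega_2$.

It is now trivial to see that $g^n(\lambda_2)|_{\lambda_2=\lambda_1} = 0$ $\forall\, 0 \leq n < \omega_1\omega_2$, $g^{\omega_1\omega_2}(\lambda_2)|_{\lambda_2=\lambda_1} = (\omega_1\omega_2)! \prod_{1 \leq k_1 < k_2 \leq \omega_1+\omega_2} (t_{k_2} - t_{k_1})$ and $g^n(\lambda_2) = 0$ for all $n \geq \omega_1\omega_2 + 1$. This establishes the desired result. □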

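The general version of the above result is the following.

Lemma A.2. For a fixed integer $d \geq 2$, let $\omega_1, \ldots, \omega_d$ be arbitrary natural numbers. Let $S_0 = 0$ and for $k \in [d]$, $S_k = S_{k-1} + \omega_k$. Further suppose that $\lambda_1, \ldots, \lambda_d$ and $t_1, \ldots, t_{S_d}$ are strictly positive real numbers satisfying $\lambda_{i_1} \neq \lambda_{i_2}$, if $i_1 \neq i_2$ and $t_{j_1} \neq t_{j_2}$, if $j_1 \neq j_2$. Then the square matrix

(A.5)  $W = \begin{pmatrix} \frac{1}{(\lambda_1+t_1)^{\omega_1}} & \cdots & \frac{1}{(\lambda_1+t_1)} & \cdots & \frac{1}{(\lambda_d+t_1)^{\omega_d}} & \cdots & \frac{1}{(\lambda_d+t_1)} \\ \vdots & & \vdots & & \vdots & & \vdots \\ \frac{1}{(\lambda_1+t_{S_d})^{\omega_1}} & \cdots & \frac{1}{(\lambda_1+t_{S_d})} & \cdots & \frac{1}{(\lambda_d+t_{S_d})^{\omega_d}} & \cdots & \frac{1}{(\lambda_d+t_{S_d})} \end{pmatrix}$

is non-singular.

Proof. It suffices to show that

(A.6)  $\det(W) = \left(\prod_{i \in [d]} \prod_{j \in [S_d]} \frac{1}{(\lambda_i+t_j)^{\omega_i}}\right) \left(\prod_{(i_1,i_2) \in \mathcal{I}_d} (\lambda_{i_2} - \lambda_{i_1})^{\omega_{i_1}\omega_{i_2}}\right) \left(\prod_{(j_1,j_2) \in \mathcal{J}} (t_{j_2} - t_{j_1})\right),$

where $\mathcal{I}_d = \{(i_1, i_2) : i_1, i_2 \in [d],\ i_1 < i_2\}$ and $\mathcal{J} = \{(j_1, j_2) : j_1, j_2 \in [S_d],\ j_1 < j_2\}$. As a first step, we do the row operations

$\text{row}_j \leftarrow \text{row}_j \times \prod_{i=1}^{d} (\lambda_i+t_j)^{\omega_i}$

for each $j \in [S_d]$ on $W$ to get the matrix $B$. For $j, k \in [S_d]$, note that

(A.7)  $(B)_{jk} = \begin{cases} \left(\prod_{i \in [d] \setminus \{1\}} (\lambda_i+t_j)^{\omega_i}\right) (\lambda_1+t_j)^{k-S_0-1}, & S_0+1 \leq k \leq S_1 \\ \quad\vdots & \quad\vdots \\ \left(\prod_{i \in [d] \setminus \{d\}} (\lambda_i+t_j)^{\omega_i}\right) (\lambda_d+t_j)^{k-S_{(d-1)}-1}, & S_{(d-1)}+1 \leq k \leq S_d. \end{cases}$

To verify (A.6), it suffices to show that

(A.8)  $\det(B) = \left(\prod_{(i_1,i_2) \in \mathcal{I}_d} (\lambda_{i_2} - \lambda_{i_1})^{\omega_{i_1}\omega_{i_2}}\right) \left(\prod_{(j_1,j_2) \in \mathcal{J}} (t_{j_2} - t_{j_1})\right).$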

For a fixed $d$, let $\mathcal{C}_d$ denote the collection of matrices of the form similar to $B$ for all possible choices of $\omega_1, \ldots, \omega_d$, $\lambda_1, \ldots, \lambda_d$, $t_1, \ldots, t_{S_d}$ satisfying the given conditions of the lemma. We will say $\mathcal{C}_d$ is non-singular if the determinant of each matrix in this collection is given by (A.8) and hence is non-zero. Now let $\mathcal{N} := \{d \in \mathbb{Z}_{++} : \mathcal{C}_d \text{ is non-singular}\}$. In these notations, a claim equivalent to (A.8) is to show that if $d \geq 2$, then $d \in \mathcal{N}$. We prove this alternate claim using induction.

From Lemma A.1 it follows that $2 \in \mathcal{N}$. We treat this as our base case. Let the induction hypothesis be that $d - 1 \in \mathcal{N}$ for some $d \geq 3$. To check if $d \in \mathcal{N}$, we verify whether (A.8) holds true for the determinant of an arbitrary matrix in $\mathcal{C}_d$. For convenience, we reuse the symbol $B$ to denote this arbitrary matrix. In relation to $B$, let $g(\lambda_d)$, $g^n(\lambda_d)$, $b_k(\lambda_d)$, $b_k^{r_k}(\lambda_d)$, $r \equiv (r_1, \ldots, r_{S_d}) \in \Delta_{S_d, n}$ and $\mu(\lambda_d; r)$ be defined as in the proof of Lemma A.1.



Our approach is similar in spirit to that used in verifying (A.3). That is, we treat $\lambda_d$ as a variable and the other indeterminates as constants and, using the induction hypothesis, show that

i. $g(\lambda_d)$ is a univariate polynomial of degree $S_{d-1}\omega_d$ and for each $m \in [d-1]$, $\lambda_m$ is a root with multiplicity exactly $\omega_m\omega_d$, i.e., $g(\lambda_d) = h \times \prod_{i \in [d-1]} (\lambda_d - \lambda_i)^{\omega_i\omega_d}$.

ii. $h = \left(\prod_{(i_1,i_2) \in \mathcal{I}_{(d-1)}} (\lambda_{i_2} - \lambda_{i_1})^{\omega_{i_1}\omega_{i_2}}\right) \times \left(\prod_{(j_1,j_2) \in \mathcal{J}} (t_{j_2} - t_{j_1})\right)$.

To begin with, observe that

(A.9)  $g^n(\lambda_d) = \sum_{r \in \Delta_{S_d, n}} \binom{n}{r_1, \ldots, r_{S_d}} \mu(\lambda_d; r).$

Let us now fix an arbitrary $m \in [d-1]$. Taking hints from the observations made in Lemma A.1, we define $r^* \in \mathbb{Z}_+^{S_d}$ by

$r_k^* = \begin{cases} \omega_d, & S_{(m-1)}+1 \leq k \leq S_m \\ 0, & \text{otherwise.} \end{cases}$

We next partition Z + d into the three sets A<.q, A r *, and A> given respectively by 

A<.o = {r : < r S(m _ 1)+1 < cj d ; V2 < k x < cu m , r S(m _ 1)+fel G {0} U [u d ] 

and r k2 = if S( d -i) + 1 < k 2 < S d } , 

A^ = {r*} and A> = 1 s + d \{A r * U-4<o}. With respect to A<. note that, when 
I < k < S'( m _ 1 ) or S m + 1 < k < S(d-i)i there is no restriction on r^. This necessarily 
implies that for each r G >4>, r k > LOdLo m . Hence these tuples do not appear in 
the expansion of g n (\d), as given in (A.9), whenever n < u m ujd- So we ignore Ay for 
the time being and determine the value of n(\d] r) for tuples lying in the other two 
sets using the definition of B given in (A.7). 
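1. $r \in \mathcal{A}_{<,0}$: We consider the following subcases.

(a) $r_{S_m} < \omega_d$: Here observe that when $\lambda_d = \lambda_m$,

$b_{S_m}^{r_{S_m}}(\lambda_d)|_{\lambda_d=\lambda_m} = \begin{cases} b_{S_d}(\lambda_d)|_{\lambda_d=\lambda_m}, & r_{S_m} = 0 \\ \left(\prod_{j=0}^{r_{S_m}-1} (\omega_d - j)\right) b_{S_d-r_{S_m}}(\lambda_d)|_{\lambda_d=\lambda_m}, & r_{S_m} \in [\omega_d - 1]. \end{cases}$

(b) There exists $k \in [\omega_m - 1]$ such that $r_{S_m} = r_{S_m-1} = \cdots = r_{S_m-k+1} = \omega_d$ and $r_{S_m-k} < \omega_d$: For this subcase, observe that if $\omega_m \leq \omega_d$, or $\omega_m > \omega_d$ and $1 \leq k < \omega_d$, then

$b_{S_m-k}^{r_{(S_m-k)}}(\lambda_d)|_{\lambda_d=\lambda_m} = \begin{cases} b_{S_d-k}(\lambda_d)|_{\lambda_d=\lambda_m}, & r_{(S_m-k)} = 0 \\ \left(\prod_{j=0}^{r_{(S_m-k)}-1} (\omega_d - j)\right) b_{S_d-k-r_{(S_m-k)}}(\lambda_d)|_{\lambda_d=\lambda_m}, & 1 \leq r_{(S_m-k)} < \omega_d - k \\ \frac{1}{(\omega_d - r_{(S_m-k)})!}\, b_{S_m+\omega_d-k-r_{(S_m-k)}}^{\omega_d}(\lambda_d)|_{\lambda_d=\lambda_m}, & \omega_d - k \leq r_{(S_m-k)} < \omega_d. \end{cases}$

On the other hand, if $\omega_m > \omega_d$ and $\omega_d \leq k < \omega_m$, then

$b_{S_m-k}^{r_{(S_m-k)}}(\lambda_d)|_{\lambda_d=\lambda_m} = \frac{1}{(\omega_d - r_{(S_m-k)})!}\, b_{S_m+\omega_d-k-r_{(S_m-k)}}^{\omega_d}(\lambda_d)|_{\lambda_d=\lambda_m}, \quad 0 \leq r_{(S_m-k)} < \omega_d.$

From this, it follows that for each $r \in \mathcal{A}_{<,0}$ there exists a pair of columns which are linearly dependent when $\lambda_d = \lambda_m$ and thus $\mu(\lambda_d; r)|_{\lambda_d=\lambda_m} = 0$.

2. $r \in \mathcal{A}_{r^*}$: Let $V'$ denote the matrix obtained after differentiating element-wise the individual columns of $B$ up to orders as indicated by $r^*$ and substituting $\lambda_d = \lambda_m$. It is easy to see that for $j, k \in [S_d]$,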



n (Ai+^-r 

\ l e[d]\{fc,m} / 

(A b + ^)^ S<6 - 1, " 1 (A m + t J T 



(^) ifc = \ \ie[d]\{m,d} , 



n (Ai+*i) 

, ie[d]\{m,d} 



5(6-1) < k < S b 
for some 6 S [d]\{m,d}. 



S(m— l) <■ k < S n 



S(d-i) < k < S d 



{ (A m + t i r m+fe ~ s ° i - 1, ~ 1 

Now observe that if we define the matrix V from V as 



(to 



l<fc<S (TO -i), 

WjMm+Vi))' #m + 1 < fc < S m + W d 

k 00 j 5 m + w d + 1 < fc < Sd, 

then V € Cd-i- Prom the induction hypothesis, it follows that 



d-l 



dctV = (-I)^-d-^-d | Y[(\ m - \j) 

n (Aj 3 — Aj 1 ) wii< ' 

v (ii,i 2 )el( £i _i) 



0"U2)eJ r 



Consequently, 

MA d ;r*)| 



det(F) 



' d-l 



\\ (A m - Xj 



v (u,j2)62T(<i-i) 



Now for each $n < \omega_m\omega_d$, note that $\Delta_{S_d, n} \subset \mathcal{A}_{<,0}$ while for $n = \omega_m\omega_d$, $\Delta_{S_d, n} \subset \mathcal{A}_{<,0} \cup \mathcal{A}_{r^*}$. From the above observations and (A.9), it then follows that, for each $0 \leq n < \omega_m\omega_d$, $g^n(\lambda_d)|_{\lambda_d=\lambda_m} = 0$ while

(A.10)  $g^{\omega_m\omega_d}(\lambda_d)|_{\lambda_d=\lambda_m} = (\omega_m\omega_d)! \left(\prod_{(i_1,i_2) \in \mathcal{I}_{(d-1)}} (\lambda_{i_2} - \lambda_{i_1})^{\omega_{i_1}\omega_{i_2}}\right) \left(\prod_{(j_1,j_2) \in \mathcal{J}} (t_{j_2} - t_{j_1})\right) \left(\prod_{i \in [d-1] \setminus \{m\}} (\lambda_m - \lambda_i)^{\omega_i\omega_d}\right).$

Clearly, $g^{\omega_m\omega_d}(\lambda_d)|_{\lambda_d=\lambda_m} \neq 0$. Thus, $\lambda_m$ is a root of $g(\lambda_d)$ of multiplicity $\omega_m\omega_d$.

Since $m$ was arbitrary to start with, it follows that for every $i \in [d-1]$, $(\lambda_d - \lambda_i)^{\omega_i\omega_d}$ is a factor of $g(\lambda_d)$. This suggests that the original structure of $g(\lambda_d)$ must have been of the form $g(\lambda_d) = h(\lambda_d) \prod_{i \in [d-1]} (\lambda_d - \lambda_i)^{\omega_i\omega_d}$ for some univariate polynomial $h(\lambda_d)$. The fact that the determined multiplicities of the roots of $g(\lambda_d)$ are exact ensures that $h(\lambda_i) \neq 0$ $\forall i \in [d-1]$. Thus $h$ is not an identically zero polynomial. It remains to show that $h$ is a constant and in particular

(A.11)  $h = \left(\prod_{(i_1,i_2) \in \mathcal{I}_{(d-1)}} (\lambda_{i_2} - \lambda_{i_1})^{\omega_{i_1}\omega_{i_2}}\right) \left(\prod_{(j_1,j_2) \in \mathcal{J}} (t_{j_2} - t_{j_1})\right).$

Towards this observe that if $n \geq S_{(d-1)}\omega_d + 1$, then for every tuple $r \in \Delta_{S_d, n}$ either one of $r_{S_{(d-1)}+1}, \ldots, r_{S_d}$ is strictly greater than zero or one amongst $r_1, \ldots, r_{S_{(d-1)}}$ is strictly bigger than $\omega_d$. In the former situation, arguing as in observations (1) and (2) from Lemma A.1, it is easy to see that $\mu(\lambda_d; r) = 0$. For the latter situation, let us suppose that for some arbitrary $1 \leq k_0 \leq S_{(d-1)}$, $r_{k_0} > \omega_d$. The fact that every element $(B)_{jk_0}$ of column $k_0$ is a polynomial in $\lambda_d$ of degree $\omega_d$ immediately shows that $b_{k_0}^{r_{k_0}}(\lambda_d) = 0$ and hence $\mu(\lambda_d; r) = 0$. Consequently it follows that $g^n(\lambda_d) = 0$ $\forall n > S_{(d-1)}\omega_d$. That is, the degree of $g(\lambda_d)$ is exactly $S_{(d-1)}\omega_d$. In other words,

(A.12)  $g(\lambda_d) = h \times \prod_{i \in [d-1]} (\lambda_d - \lambda_i)^{\omega_i\omega_d}$

for some constant $h \neq 0$. Differentiating (A.12) up to order $\omega_m\omega_d$ for some arbitrary $m \in [d-1]$ and comparing with (A.10), (A.11) immediately follows as desired. □



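We now finally prove Lemma 4.2 by showing the following general result.

Theorem A.3. For a fixed integer $d \geq 2$, let $\omega_1, \ldots, \omega_d$ be arbitrary natural numbers. Let $S_0 = 0$ and for $k \in [d]$, $S_k = S_{k-1} + \omega_k$. Further suppose that $\lambda_1, \ldots, \lambda_d$ and $t_1, \ldots, t_{S_d}$ are strictly positive real numbers satisfying $\lambda_{i_1} \neq \lambda_{i_2}$, if $i_1 \neq i_2$ and $t_{j_1} \neq t_{j_2}$, if $j_1 \neq j_2$. Then the square matrix

(A.13)  $M = \begin{pmatrix} \frac{1}{(\lambda_1+t_1)} & \cdots & \frac{t_1^{\omega_1-1}}{(\lambda_1+t_1)^{\omega_1}} & \cdots & \frac{1}{(\lambda_d+t_1)} & \cdots & \frac{t_1^{\omega_d-1}}{(\lambda_d+t_1)^{\omega_d}} \\ \vdots & & \vdots & & \vdots & & \vdots \\ \frac{1}{(\lambda_1+t_{S_d})} & \cdots & \frac{t_{S_d}^{\omega_1-1}}{(\lambda_1+t_{S_d})^{\omega_1}} & \cdots & \frac{1}{(\lambda_d+t_{S_d})} & \cdots & \frac{t_{S_d}^{\omega_d-1}}{(\lambda_d+t_{S_d})^{\omega_d}} \end{pmatrix}$

is non-singular.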

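Proof. For $k \in [S_d]$, let $b_k := \min\{i \in [d] : S_i \geq k\}$. Now using the expansion $\lambda^{n-1} = \sum_{q=0}^{n-1} (-1)^q \binom{n-1}{q} t^q (\lambda+t)^{n-1-q}$, for $n \in \mathbb{Z}_{++}$, observe that

(A.14)  $\sum_{q=0}^{n-1} (-1)^q \binom{n-1}{q} \frac{t^q}{(\lambda+t)^{q+1}} = \frac{\lambda^{n-1}}{(\lambda+t)^n}.$

Using this note that, for any $j, k \in [S_d]$,

$\sum_{q=0}^{k-S_{(b_k-1)}-1} (-1)^q \binom{k-S_{(b_k-1)}-1}{q} (M)_{j,(S_{(b_k-1)}+1+q)} = \frac{\lambda_{b_k}^{k-S_{(b_k-1)}-1}}{(\lambda_{b_k}+t_j)^{k-S_{(b_k-1)}}}.$

From this it follows that if, for each $i \in [d]$, we perform the column operations

$\text{col}_k \leftarrow \sum_{q=0}^{k-S_{(i-1)}-1} (-1)^q \binom{k-S_{(i-1)}-1}{q} \text{col}_{S_{(i-1)}+1+q}$

in the order $k = S_i, S_i - 1, \ldots, S_{(i-1)}+1$, then we end up with the matrix

$U = \begin{pmatrix} \frac{1}{(\lambda_1+t_1)} & \cdots & \frac{\lambda_1^{\omega_1-1}}{(\lambda_1+t_1)^{\omega_1}} & \cdots & \frac{1}{(\lambda_d+t_1)} & \cdots & \frac{\lambda_d^{\omega_d-1}}{(\lambda_d+t_1)^{\omega_d}} \\ \vdots & & \vdots & & \vdots & & \vdots \\ \frac{1}{(\lambda_1+t_{S_d})} & \cdots & \frac{\lambda_1^{\omega_1-1}}{(\lambda_1+t_{S_d})^{\omega_1}} & \cdots & \frac{1}{(\lambda_d+t_{S_d})} & \cdots & \frac{\lambda_d^{\omega_d-1}}{(\lambda_d+t_{S_d})^{\omega_d}} \end{pmatrix}.$

Since only reversible column operations are used to obtain $U$ from $M$, it suffices to show that $U$ is non-singular. But this is true from Lemma A.2, because each column of $U$ is a non-zero scalar multiple of a column of the matrix $W$ of (A.5). The desired result thus follows. □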
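The binomial identity (A.14), on which the column operations rest, is easy to spot-check with exact arithmetic. The Python sketch below evaluates both sides for a few hypothetical values of $\lambda$, $t$ and $n$; the function names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def lhs_A14(lam, t, n):
    """Left side of (A.14): sum_q (-1)^q C(n-1, q) t^q / (lam + t)^(q+1)."""
    return sum((-1) ** q * comb(n - 1, q) * t ** q / (lam + t) ** (q + 1)
               for q in range(n))

def rhs_A14(lam, t, n):
    """Right side of (A.14): lam^(n-1) / (lam + t)^n."""
    return lam ** (n - 1) / (lam + t) ** n

# Arbitrary positive test values (assumptions, not from the paper).
cases = [(Fraction(5), Fraction(1, 3)), (Fraction(3, 2), Fraction(7)), (Fraction(1), Fraction(1))]
identity_holds = all(lhs_A14(lam, t, n) == rhs_A14(lam, t, n)
                     for lam, t in cases for n in range(1, 7))
print(identity_holds)
```

The identity is just the binomial expansion of $((\lambda + t) - t)^{n-1}$ divided by $(\lambda + t)^n$, so the check is expected to pass for any positive inputs.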

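Appendix B. Coefficient expansion.

Lemma B.1. For any $\omega \in \mathbb{Z}_+^d$, if $\mathcal{D}(\omega) \neq \emptyset$, then

(B.1)  $\Lambda^\omega = \sum_{k \in \mathcal{D}(\omega)} \sum_{q=1}^{\omega_k} \gamma_{kq}(\omega) \Lambda_k^q(t)$

for all $t$. Further, this expansion is unique.

Proof. The uniqueness of the expansion is a simple consequence of Lemma A.13. We prove (B.1) using induction on $\deg(\omega) = \sum_{k=1}^{d} \omega_k$.

For $\deg(\omega) = 2$, our basis step, we verify (B.1) by considering the following exhaustive cases.

1. there exists a unique $k \in [d]$ such that $\omega_k = 2$: Here, with $\mathcal{D}(\omega) = \{k\}$, observe that $\Delta_{k1}(\omega) = \emptyset$ and $\Delta_{k2}(\omega) = \{0 \in \mathbb{Z}_+^d\}$. Consequently, $\gamma_{k1}(\omega) = 0$ and $\gamma_{k2}(\omega) = 1$. Equation (B.1) thus trivially holds.

2. there exist unique $k_1, k_2 \in [d]$ such that $\omega_{k_1} = \omega_{k_2} = 1$: Here observe that $\mathcal{D}(\omega) = \{k_1, k_2\}$, $\Delta_{k_1 1}(\omega) = \Delta_{k_2 1}(\omega) = \{0 \in \mathbb{Z}_+^d\}$, $\gamma_{k_1 1}(\omega) = \beta_{k_1 k_2}$ and $\gamma_{k_2 1}(\omega) = \beta_{k_2 k_1}$. Equation (B.1) clearly holds true for this case since $\forall t$

$\Lambda^\omega = \Lambda_{k_1}(t) \Lambda_{k_2}(t)$
$\quad = \beta_{k_1 k_2} \Lambda_{k_1}(t) + \beta_{k_2 k_1} \Lambda_{k_2}(t)$
$\quad = \gamma_{k_1 1}(\omega) \Lambda_{k_1}(t) + \gamma_{k_2 1}(\omega) \Lambda_{k_2}(t).$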
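The second base case is a partial-fraction split, which can be checked exactly. The Python sketch below relies on definitions that are assumptions made here, since the corresponding definitions lie outside this section: it takes $\Lambda_k(t) = \lambda_k / (\lambda_k + t)$ and $\beta_{k_1 k_2} = \lambda_{k_2} / (\lambda_{k_2} - \lambda_{k_1})$. If the paper normalizes these quantities differently, the same check applies after adjusting the two helper functions.

```python
from fractions import Fraction

def Lam(lam_k, t):
    """Assumed form of Lambda_k(t): the transform lam_k / (lam_k + t) of one exponential stage."""
    return lam_k / (lam_k + t)

def beta(lam_k1, lam_k2):
    """Assumed form of beta_{k1 k2}: lam_k2 / (lam_k2 - lam_k1)."""
    return lam_k2 / (lam_k2 - lam_k1)

lam1, lam2 = Fraction(5), Fraction(3)          # hypothetical distinct stage parameters
ts = [Fraction(1, 4), Fraction(1), Fraction(7, 2), Fraction(10)]

# Product of two stages versus its two-term expansion, compared at several t.
split_ok = all(
    Lam(lam1, t) * Lam(lam2, t)
    == beta(lam1, lam2) * Lam(lam1, t) + beta(lam2, lam1) * Lam(lam2, t)
    for t in ts
)
print(split_ok)
```

Under these assumed definitions the two coefficients also sum to one, which is consistent with both sides equalling one at $t = 0$.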



Now for some fixed $n \geq 2$, let the strong induction hypothesis be that (B.1) holds for all $\omega$ such that $\deg(\omega) \leq n$. To verify (B.1) when $\deg(\omega) = n + 1$, we consider the following exhaustive cases.

1. there exists a unique $k \in [d]$ such that $\omega_k = n + 1$: Here $\mathcal{D}(\omega) = \{k\}$,

$\Delta_{kq}(\omega) = \begin{cases} \emptyset, & 0 < q < n+1 \\ \{0 \in \mathbb{Z}_+^d\}, & q = n+1 \end{cases}$

and consequently

$\gamma_{kq}(\omega) = \begin{cases} 0, & q < n+1 \\ 1, & q = n+1. \end{cases}$

Using these note that (B.1) again holds trivially.
2. there exist unique fci, &2 G [d] such that uj^ , Wfe 2 > and ujj ei + uik 2 = n+1: The 
fact that n > 2 immediately ensures that at least one of u>k 1 , Wfc 2 is strictly big- 
ger than unity. Without loss of generality, let us assume that fci = 1, fe 2 = 2 
and U\ > 1. Observe then that T>(u>) — {1,2}. Also, for each q 6 [<^i], we 
have A lg (u>) = {(0, w\ — q, 0, . . . , 0) G , a singleton set. Hence 7 i g (u>) = 
ffi^^ffi-*. Similarly, Vg G M, 72(? (u,) = /^^"X^- Ob- 
serve next that if :— u>x — 1, a positive number, then A" can be written 

as Ai(£) jA^'j, where u/ = (wi, w 2 , 0, . . . , 0). Clearly deg(w') = n, 7i g (<*>') = 

^irri-T 1 )^" 9 , Vg G K], and 72? (u/) = ("^J'- 1 )/^', Vg G [a*]. 
By induction hypothesis, it then follows that 



A- = ^ 7lg ( W ')Af +1 (t) + ^ 72g (a,')A 1 (t)Al(f) 

g=l q=l 

(B.2) := tcrmi + term2. 

Since uj[ = u>i — 1, it follows that 

ternn = £ ^| ("* ^I*- 1 ) ftC* A.? 1 (t) 

gr=2 

(B.3) =f>,(«)A?(t). 
For term2, we consider two subcases. 

(a) W2 = 1 or equivalently wi = n: Here, term2 = fi^ 1 Ai(t)A2(t) . By additionally 
using the base case, it follows that 



term 2 = ft^faX^t) + /^A 2 (t) 
(B.4) = 7 ii(w)Ai(t)+ 72 i(u;)A 2 (t) 



Substituting (B.3) and (B.4) in (B.2), it is clear that (B.l) holds 



(b) $\omega_2 > 1$: We again apply the induction hypothesis individually to the term $\Lambda_1(t)\Lambda_2^q(t)$ for each $1 \leq q \leq \omega_2$. Note that none of these expansions yield scalar multiples of $\Lambda_1^q(t)$ for any $2 \leq q \leq \omega_1$ and, thus, their associated weights obtained in (B.3) remain unchanged in the overall decomposition of $\Lambda^\omega$. The term $\Lambda_1(t)$ is, however, definitely present in the final decomposition of term$_2$. For each $q \in [\omega_2]$, observe that the constant associated with $\Lambda_1(t)$ in the expansion of $\Lambda_1(t)\Lambda_2^q(t)$ is $\gamma_{11}(1, q, 0, \ldots, 0) = \beta_{12}^q$. From the definition of term$_2$, clearly, the constant associated to $\Lambda_1(t)$ in the final decomposition of $\Lambda^\omega$ is



— q— 1\ 
I u',-1 ) 



9=1 



9=1 



-P21P12I ^ J> 

_ 0W1-I0W2 to+wi-i-n 

^21 ^12 V CJ2 — 1 ' 

= 711 (<■•>)■ 

Note that the second equality above is a consequence of the well known bi- 
nomial identity J2r=j (j) = (j+i)- Next observe that the decomposition of 
Ai(f)A 2 (t) by the induction hypothesis also results in a linear combination of 
A 2 (i), A|(t), . . . , A|(t). This implies that 



(B.5) 



term 2 = 7H (a>)Ai(t) + E V^K*) 
9=1 



for some t— independent constant By substituing (B.3I and (B.5 1 in (B.2), 
it follows that 



(B.6) 



A" = 5% g (w)A;(i)+5>,A2(t). 

g=l ? =1 



Now recall that cj 2 > 1 in the subcase under consideration. Expressing A" as 
A 2 (i) {A^^Aj 2 " 1 ^)} and interchanging the roles of A x (t) and A 2 (t) in the 
above argument, it then follows that 



(B.7) 



A^EMiW + E^MA^) 

9 =1 9=1 



for some t— independent constants {u q }. Comparing (B.6) and (B.7), we get 



(B. 



E {719M ~ "9} A?(t) + A|(t) = 

9=1 9=1 



for all i. A simple application of Lemma A. 13 then shows that u q — 71^(0;), 
Vg £ [cji], and v q — 72 9 (<*>), Vg € [w 2 ]. By substituting these values in either 
(B.6) or (B.7), it follows that IB .11) holds for this case also. 



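3. Cardinality of the set $\mathcal{D}(\omega) = \{k \in [d] : \omega_k > 0\}$ is at least 3: Without loss of generality, let us suppose that $\omega_1, \omega_2 > 0$. Our approach here is to express $\Lambda^\omega$ as $\Lambda_1^{\omega_1}(t) \{\Lambda_2^{\omega_2}(t) \cdots \Lambda_d^{\omega_d}(t)\}$ and expand the term within the braces using the induction hypothesis. With $\omega'' = (0, \omega_2, \ldots, \omega_d)$, it follows that

$\Lambda^\omega = \Lambda_1^{\omega_1}(t) \left[\sum_{k \in \mathcal{D}(\omega) \setminus \{1\}} \sum_{q=1}^{\omega_k} \gamma_{kq}(\omega'') \Lambda_k^q(t)\right]$
$\quad = \sum_{k \in \mathcal{D}(\omega) \setminus \{1\}} \sum_{q=1}^{\omega_k} \gamma_{kq}(\omega'') \Lambda_1^{\omega_1}(t) \Lambda_k^q(t)$
$\quad =: \sum_{k \in \mathcal{D}(\omega) \setminus \{1\}} \text{term}_k.$

A neat observation, by virtue of our induction hypothesis, is that the eventual constant associated with $\Lambda_k^q(t)$, $k \in \mathcal{D}(\omega) \setminus \{1\}$, $q \in [\omega_k]$, depends solely on term$_k$. In fact the only terms that contribute a scalar multiple of $\Lambda_k^q(t)$ are $\Lambda_1^{\omega_1}(t)\Lambda_k^q(t)$, $\Lambda_1^{\omega_1}(t)\Lambda_k^{q+1}(t)$, $\ldots$, $\Lambda_1^{\omega_1}(t)\Lambda_k^{\omega_k}(t)$. With this in mind, we focus on term$_2$ and evaluate the eventual constant, say $a$, associated with $\Lambda_2^q(t)$ for some arbitrary $q \in [\omega_2]$. Firstly observe that the scalar multiple of $\Lambda_2^q(t)$ in the expansion of $\Lambda_1^{\omega_1}(t)\Lambda_2^{q+j}(t)$, when $0 \leq j \leq \omega_2 - q$, is $\gamma_{2q}(\omega_1, q+j, 0, \ldots, 0)$. This implies that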

a = E 72(g+j)(V)72g(wi,9+ j, 0, ...,0) 

i=o 

: = E 

In relation to each cj, observe that A 2g (u;i, q + j, 0, . . . , 0) = {(j, 0, . . . , 0)} , a sin- 
gleton set, and for any s = (s l7 . . . , s d ) G K 2 (q+j) s i = and J2 r =2 s r = w 2 — 
g— j. This implies that s € A 2 ( g+J ) if and only if the tuple (j, s 2 , s 3 , . . . , G 
A 2g (u;) f]Bj, where := {seZf : s x = . Thus, 

n ^ e n (^-r 1 )^- 

Consequently, it follows that a = 7 2(? (u;). By replicating this argument for every 
k € and every d € [w*.], observe that the weight associated with A^(i) in 

the final expansion is 7fc 9 (a;). This, combined with the fact that the decomposition 
of A^ 1 (t)A q k (t) also results in a linear combination of Ai(t), . . . , A" 1 (i), shows that 

A"=5> 9 A?(t) + £ £ 7fc9 (u,)A«(i) 
g=i fcex>(w)\{i} <j=i 

for some t— independent constants u\, . . . , u Ul . But observe that A" can also be 
expressed as A 2 2 (t) {A" 1 (^Ag 3 (i) . . . A^ d (i)} . By repeating the above arguments 
with an interchange of the roles of Ai(t) and A 2 (i), we, thus, also get 

A" = f>,A2(t)+ X! E7fe,MA2(t) 

«=1 feeX)(w)\{2} <?=i 

for some real constants v%, . . . , w W2 . Arguing as in case (2b) above, it is easy to see 
that u q = 7i g (w), Vq £ [cox] and v q = 7 2<3 (w), Vq € [w 2 ]. 
This completes the induction argument and, hence, proves the desired result. □ 

Appendix C. Regularity of EPS. For discussions pertaining to this section, 
we suppose that the domain and the co-domain of the map £, used to define the EPS 



of (4.131, is C d ' Ni . We now prove Lemma 4.4 through the following series of results 



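Lemma C.1. Suppose $a_1, \ldots, a_{N_i}$ are $N_i$ distinct complex numbers. Let $\mathbf{a} =
(\mathbf{a}_1, \ldots, \mathbf{a}_{N_i})$, where $\mathbf{a}_j = (a_j, \ldots, a_j) \in \mathbb{C}^d$. Then

\[ \det(\mathcal{E}(\mathbf{x}))\big|_{\mathbf{x}=\mathbf{a}} = \prod_{1 \le j_1 < j_2 \le N_i} (a_{j_1} - a_{j_2})^d \neq 0. \tag{C.1} \]

Proof. Pick an arbitrary $\Omega \in \Delta_{d+1, N_i}$, $j_2 \in [N_i]$ and $k_2 \in [d]$. We first show that

\[ \left.\frac{\partial g(\mathbf{x}; \Omega)}{\partial x_{j_2 k_2}}\right|_{\mathbf{x}=\mathbf{a}} =
\begin{cases}
0, & \omega_{k_2} = 0, \\
\nu(\Omega; k_2)\, e_{\eta(\omega)}(a_1, \ldots, a_{j_2-1}, a_{j_2+1}, \ldots, a_{N_i}), & \text{otherwise},
\end{cases} \tag{C.2} \]

where $\nu(\Omega; k_2)$ is a number, independent of the choice of the vector $\mathbf{a}$ and $j_2 \in [N_i]$,
$\eta(\omega) = \left(\sum_{k=1}^{d} \omega_k\right) - 1 \in \mathbb{Z}_+$ and $e_l(\cdot)$, for $0 \le l \le N_i - 1$, is the $l$-th elementary
symmetric polynomial in $N_i - 1$ variables.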

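Suppose that $\omega_{k_2} = 0$. Then clearly, $\forall \mathbf{b} \in \mathcal{B}_\Omega$, $\theta_{k_2}(\mathbf{b}) = 0$. This implies that
$g(\mathbf{x}; \Omega)$ is a constant with respect to $x_{j_2 k_2}$. Thus, $\left.\frac{\partial g(\mathbf{x}; \Omega)}{\partial x_{j_2 k_2}}\right|_{\mathbf{x}=\mathbf{a}} = 0$ as desired.

Now suppose that $\omega_{k_2} \in [N_i]$. Clearly,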



n ) = x hk 2 E 



n 



x jbj 



E 



n x ^ 



Hence, 
(C.3) 



G> 5 (x;ft) 



dx 



32^2 



E 



n 



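Now for any $S \subset [N_i]$, such that $|S| = \omega_{d+1}$, define the set $\mathcal{B}_\Omega(j_2, k_2, S) \subset \mathcal{B}_\Omega$ by

\[ \mathcal{B}_\Omega(j_2, k_2, S) = \{\mathbf{b} \in [d+1]^{N_i} : b_{j_2} = k_2;\ \forall j \in S,\ b_j = d + 1;\ \theta(\mathbf{b}) = \Omega\}. \]

Its cardinality, given by

\[ |\mathcal{B}_\Omega(j_2, k_2, S)| = \binom{N_i - 1 - \omega_{d+1}}{\omega_1, \ldots, \omega_{(k_2 - 1)},\ \omega_{k_2} - 1,\ \omega_{(k_2 + 1)}, \ldots, \omega_d,\ 0}, \tag{C.4} \]

clearly depends on $|S|$, but not on $S$ itself. Also, it is independent of the choice of the
vector $\mathbf{a}$ and $j_2 \in [N_i]$. By defining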



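\[ \nu(\Omega; k_2) = |\mathcal{B}_\Omega(j_2, k_2, S)| \tag{C.5} \]

and using the fact that $\{\mathbf{b} \in \mathcal{B}_\Omega : b_{j_2} = k_2\} = \bigcup_{S \subset [N_i],\, |S| = \omega_{d+1}} \mathcal{B}_\Omega(j_2, k_2, S)$, a disjoint
union, observe from (C.3) that

\[ \left.\frac{\partial g(\mathbf{x}; \Omega)}{\partial x_{j_2 k_2}}\right|_{\mathbf{x}=\mathbf{a}}
= \sum_{S \subset [N_i],\, |S| = \omega_{d+1}} \nu(\Omega; k_2) \prod_{j \in [N_i] \setminus (S \cup \{j_2\})} a_j
= \nu(\Omega; k_2)\, e_{\eta(\omega)}(a_1, \ldots, a_{(j_2 - 1)}, a_{(j_2 + 1)}, \ldots, a_{N_i}), \]

as desired. This now completes the verification of (C.2).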
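
The counting claim in (C.4) can be checked by brute force on small cases. The following is a minimal sketch, assuming that $\theta(\mathbf{b})$ records how many entries of $\mathbf{b}$ equal each of $1, \ldots, d+1$ (its definition lies outside this appendix), that the entries of $\Omega$ sum to $N_i$, and that $S$ does not contain $j_2$. The function names are illustrative and do not come from the paper; positions are 0-based while the values $k_2$ are 1-based.

```python
from itertools import product
from math import factorial


def count_vector(b, d):
    """Assumed meaning of theta(b): number of entries of b equal to 1, ..., d+1."""
    return tuple(b.count(value) for value in range(1, d + 2))


def brute_force_size(n, d, omega, j2, k2, s):
    """Count tuples b in [d+1]^n with b[j2] = k2, b[j] = d+1 on s, theta(b) = omega."""
    total = 0
    for b in product(range(1, d + 2), repeat=n):
        if b[j2] != k2:
            continue
        if any(b[j] != d + 1 for j in s):
            continue
        if count_vector(b, d) == tuple(omega):
            total += 1
    return total


def multinomial_size(n, d, omega, k2):
    """Multinomial coefficient of (C.4): (n - 1 - omega_{d+1})! over the reduced counts."""
    counts = list(omega[:d])
    counts[k2 - 1] -= 1  # one copy of k2 is already placed at position j2
    if counts[k2 - 1] < 0:
        return 0
    result = factorial(n - 1 - omega[d])
    for c in counts:
        result //= factorial(c)
    return result


# Two different choices of S with the same size give the same count.
size_a = brute_force_size(5, 2, (2, 2, 1), j2=0, k2=1, s={4})
size_b = brute_force_size(5, 2, (2, 2, 1), j2=0, k2=1, s={2})
formula = multinomial_size(5, 2, (2, 2, 1), k2=1)
```

In this toy case the enumeration depends on $S$ only through its size, which is the property used to define $\nu(\Omega; k_2)$.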

Now fix an arbitrary $k_1, k_2 \in [d]$ and $q_1, j_2 \in [N_i]$. As a consequence of (C.2),
observe that

\[ \left.\frac{\partial h_{k_1 q_1}(\mathbf{x})}{\partial x_{j_2 k_2}}\right|_{\mathbf{x}=\mathbf{a}}
= \sum_{\Omega \in \Delta_{d+1, N_i},\ \omega_{k_2} > 0} \gamma_{k_1 q_1}(\omega)\, \nu(\Omega; k_2)\,
e_{\eta(\omega)}(a_1, \ldots, a_{j_2 - 1}, a_{j_2 + 1}, \ldots, a_{N_i}). \tag{C.6} \]

By grouping the symmetric polynomials in the above equation that have the same
degree, i.e., identical values of $\eta(\omega)$, we get



(C.7) 
where 



dh kiqi (x) 



dxj 2 



Ni-l 

= ^ ci{ki 1 q ll k 2 )ei(ai 

x=a l= qi -l 



i ■ ■ • ! a 32 — 1; • ■ ■ i 



ci(ki,qi,k 2 ) := 



[ 7fcigi(») 



Using (4.6 1, ( C.4 1 and (|C.5|, note in particular that 
(C.8) 



c 9l _i(fci,gi,fc 2 ) 



For every k\ , fc 2 € [JVj] , with 



(C.9) 



/ 3/tfc 1 i(x) 



hkjNj (x) 



1, fci = fc 2 
0, otherwise. 



feii( x ) 



dx 



consider the matrix 



(CIO) 



N t k 2 



dh klN . (x) 
dx N k 



"l,d 



S:=£(x) 



i/(n ; fe 2 ). 



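Using elementary row operations, our strategy is to show that

\[ \det(\mathcal{E}(\mathbf{x})) = \det
\begin{pmatrix}
\Gamma & 0 & \cdots & 0 \\
0 & \Gamma & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \Gamma
\end{pmatrix}
= (\det(\Gamma))^d, \tag{C.11} \]

where

\[ \Gamma =
\begin{pmatrix}
e_0(a_2, \ldots, a_{N_i}) & \cdots & e_0(a_1, \ldots, a_{N_i - 1}) \\
\vdots & & \vdots \\
e_{N_i - 1}(a_2, \ldots, a_{N_i}) & \cdots & e_{N_i - 1}(a_1, \ldots, a_{N_i - 1})
\end{pmatrix}. \tag{C.12} \]

This is clearly sufficient to verify (C.1) since $\det(\Gamma) = \prod_{1 \le j_1 < j_2 \le N_i} (a_{j_1} - a_{j_2})$ is a
well known result.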
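
The determinant identity for the matrix of elementary symmetric polynomials can be sanity-checked with exact arithmetic. The following is a minimal sketch, assuming that row $q$ and column $j$ of the matrix hold $e_{q-1}$ evaluated at all of $a_1, \ldots, a_{N_i}$ except $a_j$, as in (C.12). The helper names are illustrative and not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def elem_sym(values, degree):
    """Elementary symmetric polynomial of the given degree evaluated at `values`."""
    return sum(prod(c) for c in combinations(values, degree))


def gamma_matrix(a):
    """Row q (0-based) holds e_q of `a` with the j-th entry left out, for each column j."""
    n = len(a)
    return [[Fraction(elem_sym(a[:j] + a[j + 1:], q)) for j in range(n)]
            for q in range(n)]


def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def pairwise_difference_product(a):
    """Product of (a[i] - a[j]) over all index pairs i < j."""
    return prod(a[i] - a[j] for i in range(len(a)) for j in range(i + 1, len(a)))


sample = [3, 1, 4, 7]
lhs = det(gamma_matrix(sample))
rhs = pairwise_difference_product(sample)
```

For distinct entries the two quantities agree and are non-zero, while a repeated entry produces two identical columns and hence a zero determinant.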

For notational convenience, the matrix obtained after applying elementary operations
on $\Xi$ will again be referred to by $\Xi$. The notation row$(r, k_1)$ will mean the $r$-th
row of the $k_1$-th block matrix row of $\Xi$, i.e. the $r$-th row of

\[ [\ \Xi_{k_1, 1}\ \cdots\ \Xi_{k_1, d}\ ]. \]

Similarly col$(c, k_2)$ will refer to the $c$-th column of the $k_2$-th block matrix column of $\Xi$.
We will say that the matrix $\Xi$ is in state $n$, $n \in [N_i]$, if:

1. $\det(\Xi) = \det(\mathcal{E}(\mathbf{x}))$.

2. for each $k_1 \in [d]$ and $q \in [N_i]$, the $q$-th row of $(\Xi_{k_1 k_1} - \Gamma)$, where $\Gamma$ is as defined in
(C.12), is

\[ \Big[\ \sum_{l=q}^{N_i - n} c_l(k_1, q, k_1)\, e_l(a_2, \ldots, a_{N_i})\ \cdots\ \sum_{l=q}^{N_i - n} c_l(k_1, q, k_1)\, e_l(a_1, \ldots, a_{N_i - 1})\ \Big]. \]

3. for each $k_1, k_2 \in [d]$, such that $k_1 \neq k_2$, and each $q \in [N_i]$, the $q$-th row of $\Xi_{k_1 k_2}$ is

\[ \Big[\ \sum_{l=q}^{N_i - n} c_l(k_1, q, k_2)\, e_l(a_2, \ldots, a_{N_i})\ \cdots\ \sum_{l=q}^{N_i - n} c_l(k_1, q, k_2)\, e_l(a_1, \ldots, a_{N_i - 1})\ \Big]. \]



From (C.7) and (C.8), it is clear that $\Xi$, as given in (C.10), is in state 1. To verify
(C.11), it suffices to show that the state of $\Xi$ can be changed to $N_i$ using reversible
elementary row operations. We do so by describing the reversible row operations
needed to put $\Xi$ in state $n + 1$ starting from a state $n$, where $1 \le n < N_i$.

Suppose that $k_1 = k_2$. Then from the definition of $\Xi$ being in state $n$, it follows
that when $N_i - n < q \le N_i$, the $q$-th row of $\Xi_{k_1 k_2}$ is given by

\[ [\ e_{q-1}(a_2, \ldots, a_{N_i})\ \cdots\ e_{q-1}(a_1, \ldots, a_{N_i - 1})\ ], \]

while, for $1 \le q \le N_i - n$, it is given by

\[ \Big[\ \sum_{l=q-1}^{N_i - n} c_l(k_1, q, k_1)\, e_l(a_2, \ldots, a_{N_i})\ \cdots\ \sum_{l=q-1}^{N_i - n} c_l(k_1, q, k_1)\, e_l(a_1, \ldots, a_{N_i - 1})\ \Big]. \]

On the other hand, when $k_1 \neq k_2$, the $q$-th row, $N_i - n < q \le N_i$, of $\Xi_{k_1 k_2}$ is $0 \in \mathbb{C}^{N_i}$,
while for $1 \le q \le N_i - n$, it is given by

\[ \Big[\ \sum_{l=q}^{N_i - n} c_l(k_1, q, k_2)\, e_l(a_2, \ldots, a_{N_i})\ \cdots\ \sum_{l=q}^{N_i - n} c_l(k_1, q, k_2)\, e_l(a_1, \ldots, a_{N_i - 1})\ \Big]. \]

From these observations, it is clear that the effect of subtracting a scalar multiple of
row$(N_i - n + 1, k)$, for some $k \in [d]$, from any other arbitrary row of $\Xi$, say row$(r, k_1)$,
where $k_1 \in [d]$ and $r \in [N_i]$, is only on the elements whose column index is one of
col$(1, k), \ldots,$ col$(N_i, k)$.

Now fix an arbitrary $k \in [d]$ and consider the following row operations:

1. row$(N_i - n, k)$ ← row$(N_i - n, k) - c_{N_i - n}(k, N_i - n, k) \times$ row$(N_i - n + 1, k)$. Leaving
all other elements unchanged, this operation changes the $(N_i - n)$-th row of the
matrix $\Xi_{k k}$ to

\[ [\ e_{N_i - n - 1}(a_2, \ldots, a_{N_i})\ \cdots\ e_{N_i - n - 1}(a_1, \ldots, a_{N_i - 1})\ ]. \]

2. row$(q, k)$ ← row$(q, k) - c_{N_i - n}(k, q, k) \times$ row$(N_i - n + 1, k)$ for each $1 \le q < N_i - n$.
For every $q$, leaving other elements unchanged, this operation changes the $q$-th row
of $\Xi_{k k}$ to

\[ \Big[\ \sum_{l=q-1}^{N_i - (n+1)} c_l(k, q, k)\, e_l(a_2, \ldots, a_{N_i})\ \cdots\ \sum_{l=q-1}^{N_i - (n+1)} c_l(k, q, k)\, e_l(a_1, \ldots, a_{N_i - 1})\ \Big]. \]

3. row$(N_i - n, k_1)$ ← row$(N_i - n, k_1) - c_{N_i - n}(k_1, N_i - n, k) \times$ row$(N_i - n + 1, k)$ for
each $k_1 \in [d] \setminus \{k\}$. For every $k_1$, leaving other elements unchanged, this operation
changes the $(N_i - n)$-th row of the matrix $\Xi_{k_1 k}$ to $0 \in \mathbb{C}^{N_i}$.

4. row$(q, k_1)$ ← row$(q, k_1) - c_{N_i - n}(k_1, q, k) \times$ row$(N_i - n + 1, k)$ for each $k_1 \in [d] \setminus \{k\}$
and each $1 \le q < N_i - n$. For every $k_1, q$, leaving other elements unchanged, this
operation changes the $q$-th row of $\Xi_{k_1 k}$ to

\[ \Big[\ \sum_{l=q}^{N_i - (n+1)} c_l(k_1, q, k)\, e_l(a_2, \ldots, a_{N_i})\ \cdots\ \sum_{l=q}^{N_i - (n+1)} c_l(k_1, q, k)\, e_l(a_1, \ldots, a_{N_i - 1})\ \Big]. \]

At present, the matrices $\Xi_{1 k}, \ldots, \Xi_{d k}$ are in the form required when $\Xi$ is in
state $n + 1$. By repeating the above operations for each $k \in [d]$, the entire matrix $\Xi$
can thus be put in state $n + 1$ as desired. □

We now recall some standard results from numerical algebraic geometry. Lemmas
C.4 and C.5 may be found in [20], Theorems C.6 and C.7 may be found in [24],
while Theorem C.8 may be found in [12] or [17].

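Definition C.2. A set $V \subset \mathbb{C}^n$ is said to be Zariski closed if there exist
$f_1, \ldots, f_k \in \mathbb{C}[x_1, \ldots, x_n]$, such that

\[ V = \{\mathbf{x} = (x_1, \ldots, x_n) \in \mathbb{C}^n : f_1(\mathbf{x}) = \cdots = f_k(\mathbf{x}) = 0\}. \]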



Definition C.3. A set $V \subset \mathbb{C}^n$ is said to be Zariski open if its complement is
Zariski closed.

By replacing $\mathbb{C}$ with $\mathbb{R}$ everywhere in the above definitions, Zariski closed and
Zariski open sets of $\mathbb{R}^n$ can be equivalently defined.

To distinguish from the open and closed sets of the usual Euclidean topology,
the open and closed sets of the Zariski topology will always have the word Zariski
prefixed to them.

Lemma C.4. A Zariski open subset $O$ of $\mathbb{R}^n$ (or $\mathbb{C}^n$) is also open in the usual
Euclidean topology of $\mathbb{R}^n$ (or $\mathbb{C}^n$). Further, if $O$ is non-empty, then it is also dense
in the usual topology.

Lemma C.5. Let $O$ be a non-empty Zariski open subset of $\mathbb{C}^n$. Then $O \cap \mathbb{R}^n$ is
a non-empty Zariski open subset of $\mathbb{R}^n$.

Theorem C.6. For a polynomial system $F(\mathbf{x}) = (f_1(\mathbf{x}), \ldots, f_n(\mathbf{x})) = 0$, where
$f_k : \mathbb{C}^n \to \mathbb{C}$, the total number of its isolated solutions, counting multiplicities, is
bounded above by its total degree, i.e., the product $\deg(f_1) \cdots \deg(f_n)$.
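
The total-degree bound can be illustrated on two toy systems in two variables. The following is a minimal sketch only: the grid search is not a general solver, and the systems were picked (as an assumption of this example, not from the paper) so that all of their complex solutions are integer points inside the grid. The function names are illustrative.

```python
from itertools import product
from math import prod


def total_degree_bound(degrees):
    """Product deg(f_1) * ... * deg(f_n), the upper bound on isolated solutions."""
    return prod(degrees)


def integer_roots(system, search_range):
    """All points of the 2-D integer grid where every polynomial of the system vanishes."""
    return [p for p in product(search_range, repeat=2)
            if all(f(*p) == 0 for f in system)]


grid = range(-3, 4)

# System A: x^2 - 1 = 0 and y^2 - 4 = 0, with degrees 2 and 2.
system_a = [lambda x, y: x * x - 1, lambda x, y: y * y - 4]
roots_a = integer_roots(system_a, grid)

# System B: x^2 - 1 = 0 and x*y - 1 = 0, with degrees 2 and 2.
system_b = [lambda x, y: x * x - 1, lambda x, y: x * y - 1]
roots_b = integer_roots(system_b, grid)

bound = total_degree_bound([2, 2])
print(len(roots_a), len(roots_b), bound)
```

System A attains the bound while system B has fewer solutions, which shows that the total degree is an upper bound rather than an exact count.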

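Theorem C.7. For $k \in [n]$, let $f_k(\mathbf{x}; \mathbf{q}) : \mathbb{C}^n \times \mathbb{C}^m \to \mathbb{C}$ be a polynomial
in both $\mathbf{x}$ and $\mathbf{q}$. Then for the polynomial system $F(\mathbf{x}; \mathbf{q}) = 0$, where $F(\mathbf{x}; \mathbf{q}) =
(f_1(\mathbf{x}; \mathbf{q}), \ldots, f_n(\mathbf{x}; \mathbf{q}))$, there exists a non-empty Zariski open set $O \subset \mathbb{C}^m$ such that,
for each $\mathbf{q} \in O$, the system has $r_i$ isolated solutions of multiplicity $i$, where $r_i$ is an
integer independent of $\mathbf{q} \in O$.

Theorem C.8. Let $F : \mathbb{C}^{d \cdot N_i} \to \mathbb{C}^{d \cdot N_i}$ be a polynomial map. Then there exists
a non-empty Zariski open set $U$ of the co-domain such that for any $\mathbf{u} \in U$, if $\mathbf{z}$ is a
root of $F(\mathbf{x}) - \mathbf{u} = 0$, then the Jacobian of $F$ at $\mathbf{z}$ is non-singular.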

We are now ready to prove Lemma 4.4. But we first show its generalization in

Lemma C.9. There exists an open dense set $\mathcal{C}_i$ of $\mathbb{C}^{d \cdot N_i}$ such that if $\mathbf{w} \in \mathcal{C}_i$, then
the solution set of the EPS given in (4.13) satisfies the following properties:

1. $|V(H_i)| = k \times N_i!$, where $k \in \mathbb{Z}_{++}$ is independent of $\mathbf{w} \in \mathcal{C}_i$, and

2. Each solution is non-singular.

Proof. Consider the Zariski open set $\mathcal{C}^1 := \{\mathbf{x} \in \mathbb{C}^{d \cdot N_i} : \det(\mathcal{E}(\mathbf{x})) \neq 0\}$. From
Lemmas C.1 and C.4 it is clear that $\mathcal{C}^1$ is non-empty and hence an open dense subset
of $\mathbb{C}^{d \cdot N_i}$.

For the map $\mathcal{E}$ that is used to define the EPS, let $U \subset \mathbb{C}^{d \cdot N_i}$ be the non-empty
Zariski open set as guaranteed in Theorem C.8. Also, let $\mathcal{C}^2 := \mathcal{E}^{-1}(U)$. Clearly, if
$\mathbf{w} \in \mathcal{C}^2$, then the EPS necessarily has only non-singular roots. Thus, $\mathcal{C}^2 \subset \mathcal{C}^1$. But
we now show that $\mathcal{C}^2$ itself is an open dense subset of $\mathbb{C}^{d \cdot N_i}$. Since $U$ is open and $\mathcal{E}$ is
continuous, it follows that $\mathcal{C}^2$ is open. To show that it is dense we assume the contrary.
Then clearly, there exists a $\delta > 0$ and $\mathbf{x} \in \mathbb{C}^{d \cdot N_i} \setminus \mathcal{C}^2$ such that $B(\mathbf{x}; \delta) \cap \mathcal{C}^2 = \emptyset$. This
implies that

\[ \mathcal{E}(B(\mathbf{x}; \delta)) \cap U = \emptyset. \tag{C.13} \]

But $\mathcal{C}^1$ is an open dense subset. Thus, $B(\mathbf{x}; \delta) \cap \mathcal{C}^1 \neq \emptyset$. From the inverse function
theorem, it follows that, for any $\mathbf{z} \in B(\mathbf{x}; \delta) \cap \mathcal{C}^1$, $\mathcal{E}$ is a local homeomorphism at $\mathbf{z}$.
Thus, $\mathcal{E}(B(\mathbf{x}; \delta))$ has non-empty interior. Equation (C.13) then contradicts the fact
that $U$ is an open dense subset. Hence, $\mathcal{C}^2$ is also open dense.

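For the map $F(\mathbf{x}, \mathbf{w}) : \mathbb{C}^{d \cdot N_i} \times \mathbb{C}^{d \cdot N_i} \to \mathbb{C}^{d \cdot N_i}$ defined by $F(\mathbf{x}, \mathbf{w}) = \mathcal{E}(\mathbf{x}) - \mathcal{E}(\mathbf{w})$, let $\mathcal{C}^3$
be the set as guaranteed in Theorem C.7. Then clearly, $\mathcal{C}_i := \mathcal{C}^2 \cap \mathcal{C}^3$ is an open dense
subset of $\mathbb{C}^{d \cdot N_i}$. Furthermore, if $\mathbf{w} \in \mathcal{C}_i$, then

i. the EPS has only non-singular roots or equivalently $r_i = 0$, $\forall i \neq 1$, and

ii. there are precisely $r_1 \in \mathbb{Z}_{++}$ non-singular roots, where $r_1$ is independent of $\mathbf{w} \in
\mathcal{C}_i$.

Note that $r_1 \ge 1$ since $\mathbf{w}$ is always a root. Because of Theorem C.6 it is also finite.
From the symmetry of the EPS (see Lemma 4.3) and the non-singularity of the
roots, it is now easy to see that $r_1 = k \times N_i!$ for some $k \in \mathbb{Z}_{++}$. □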

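Let $\mathcal{C}^1, \mathcal{C}^2, \mathcal{C}^3$ and $U$ be defined as in the proof above. From Lemmas C.4 and
C.5 it follows that $\mathcal{R}^1 := \mathcal{C}^1 \cap \mathbb{R}^{d \cdot N_i}$, $U \cap \mathbb{R}^{d \cdot N_i}$ and $\mathcal{R}^3 := \mathcal{C}^3 \cap \mathbb{R}^{d \cdot N_i}$ are open dense
subsets of $\mathbb{R}^{d \cdot N_i}$. The same argument that was used to prove $\mathcal{C}^2$ is an open dense
subset of $\mathbb{C}^{d \cdot N_i}$, with the real version of the inverse function theorem,
also shows that $\mathcal{R}^2 := \mathcal{E}^{-1}(U \cap \mathbb{R}^{d \cdot N_i}) \cap \mathbb{R}^{d \cdot N_i}$ is an open dense subset of $\mathbb{R}^{d \cdot N_i}$. But
note that all coefficients of $\mathcal{E}$ are real. Hence, $\mathcal{R}^2 = \mathcal{E}^{-1}(U) \cap \mathbb{R}^{d \cdot N_i} = \mathcal{C}^2 \cap \mathbb{R}^{d \cdot N_i}$.
This now implies that $\mathcal{R}_i := \mathcal{R}^2 \cap \mathcal{R}^3 = \mathcal{C}_i \cap \mathbb{R}^{d \cdot N_i}$ is an open dense subset of $\mathbb{R}^{d \cdot N_i}$.
Since $\mathcal{R}_i \subset \mathcal{C}_i$, even if $\mathbf{w} \in \mathcal{R}_i$ the EPS has the same properties as those described in
Lemma C.9. This now completes the verification of Lemma 4.4 as desired.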